PRIMERS-ATTACHED VECTOR ELONGATION (PAVE):
A 5'-DIRECTED cDNA CLONING STRATEGY




FIELD OF THE INVENTION 
The present invention provides a novel method for preparing cDNA libraries 
containing enhanced percentages of full-length cDNA inserts. 




BACKGROUND OF THE INVENTION 
Technology aimed at the production of cDNA libraries, which are important tools
in the discovery of biologically relevant genetic sequences, often produces cDNA libraries
that are far from perfect. cDNA libraries may contain a high percentage of molecules
where the cDNA insert within the library vector is not full-length as compared to the
naturally-occurring mRNA molecule from which the cDNA was derived. cDNA
libraries, even those designed to be "directional" or having the cDNA insert present in
a particular 5'->3' orientation relative to the vector sequences, often contain a high
percentage of "flipped" inserts where the cDNA insert is oriented in the opposite
orientation from that which is most desirable for characterization and expression of the
cDNA insert. In addition, some cDNA libraries demonstrate a high incidence of multiple
inserts, where unrelated cDNA molecules are aberrantly ligated into the same vector
molecule.

There exists a need for novel methods of cDNA library production, and it is to
such methods that the present invention is directed.
Construction of high quality cDNA libraries, with greater than 90% of
the inserts being the full-length copy of the corresponding mRNA
molecules, is crucial to the success of our effort to clone all the human
genes encoding secreted proteins. Several factors contribute to the poor
quality of cDNA libraries constructed using the conventional method, i.e.,
cDNA synthesis followed by ligation into plasmid or phage vectors. First,
mRNA molecules may be degraded during RNA isolation and in the process
of first-strand cDNA synthesis. In addition, most mRNA samples are
isolated from total cellular RNA using the oligo-dT capture protocol and
are therefore contaminated with partially processed poly(A)-containing
precursor RNA and partially degraded 3' portions of mRNA molecules.
Second, during first-strand cDNA synthesis, reverse transcriptase tends to
prematurely fall off the RNA templates due to RNA secondary structures or
insufficient processivity of the enzyme itself. Third, the ligation step
after ds cDNA synthesis may result in the following undesirable artifacts:
A). Multiple cDNA inserts are ligated into the same vector due to the high
insert/vector ratio used to increase the population of clones containing a
cDNA insert. B). There is a high percentage (about 10%) of flipped cDNA
inserts when a unidirectional library is constructed. C). Contaminating DNA
can be incorporated into the library. For example, some of the early
libraries constructed by Clontech were contaminated by yeast chromosomal
DNA when yeast tRNA was used to precipitate the cDNA. Another example
is that when the full-length cDNA was selected (Carninci et al., 1996),
ligation of contaminating partial cDNA into the vector compromised the
quality of the library. D). There is a selection for smaller cDNA inserts since
they are ligated more efficiently than larger ones.

Numerous efforts have been made to increase the cloning efficiency
from a definite amount of mRNA and/or to increase the proportion of
full-length inserts. Some of the most successful approaches include:
A). An engineered reverse transcriptase was designed by GIBCO-BRL to
inactivate its RNase H activity, which causes on-template RNA cleavage
and premature termination of transcription when the enzyme stutters
before a secondary structure. Thus far, the Superscript II reverse
transcriptase (BRL) remains the most popular enzyme for first-strand
cDNA synthesis. B). Oligo-dT tailed vectors were used for first-strand
cDNA synthesis (Okayama and Berg, 1982; Alexander et al., 1984;
Bellemare et al., 1991; Kato et al., 1994). This method dramatically
increased the cloning efficiency and the proportion of insert-containing
clones. C). Strategies for specific capture (Edery et al., 1995) or labeling
of the 5'-end cap of mRNA molecules with oligonucleotides (Fromont-Racine
et al., 1993; Liu and Gorovsky, 1993; Maruyama and Sugano, 1994;
Kato et al., 1994) or biotin (Carninci et al., 1996, 1997) were used to
select for full-length cDNA. Libraries constructed with a selection for the
5'-end cap, such as the Kato strategy (Kato et al., 1994, the Protagene
protocol) and the biotin capture method (Carninci et al., 1996), have a high
percentage of full-length cDNA inserts, ranging from 70% to 95%.
However, none of the above-mentioned strategies could completely satisfy
the requirements for high efficiency, a high proportion of full-length cDNA
inserts, and a low level of contaminating or aberrant DNA inserts due to DNA
ligation.
BRIEF DESCRIPTION OF THE DRAWINGS
Figure 1 is a schematic representation of the disclosed method for preparing
mRNA molecules for cDNA library construction: mRNA is treated with phosphatase and
then with pyrophosphatase, followed by ligation with RNA ligase to add an RNA tag to
the 5' phosphate that will only be present on full-length mRNA molecules.

Figure 2 is an autoradiograph of a Northern blot showing the ligation of tobacco
acid pyrophosphatase (TAP)-treated (lanes 1 and 2) or capped (no TAP treatment, lane
3) rabbit globin mRNA with either an RNA tag (lanes 1 and 3) or a DNA tag (lane 2) using
T4 RNA ligase. The blot was hybridized with a radioactively labeled
oligodeoxynucleotide complementary to the tag sequence. The arrow points to the position of
full-length tagged rabbit globin mRNA. This Northern blot indicates that TAP treatment is
necessary for efficient RNA ligation, and that, as compared to DNA tags, RNA tags are
more efficiently ligated to mRNA molecules by T4 RNA ligase.

Figure 3 is a schematic representation of the pED6pdc4 vector that may be used
for construction of cDNA libraries as disclosed herein, and includes the nucleotide
sequence of the polylinker region of the pED6pdc4 vector.

Figure 4 is a schematic representation of the pED6pdc2 vector from which the
pED6pdc4 vector was derived, and includes the nucleotide sequence of the polylinker
region of the pED6pdc2 vector.

Figure 5 is another schematic representation of the pED6pdc2 vector and contains
more information concerning the attributes of the pED6pdc2 vector. The pED6dpc2 vector
was derived from pED6dpc1 by insertion of a new polylinker to facilitate cDNA cloning
(Kaufman et al., 1991, Nucleic Acids Res. 19: 4485-4490).

Figure 6 is a nucleotide sequence alignment that shows in detail the nucleotide
differences between the pED6pdc2 and pED6pdc4 vectors.

Figure 7 is a schematic representation of the pED6pdc4 vector that may be used
for construction of cDNA libraries as disclosed herein, and shows that the vector is
digested with certain restriction enzymes and ligated to particular 5' and 3' linkers to form
a pED6pdc4 vector-primer construct.

Figure 8 is a schematic representation of the pAVE1 vector that may be used for
construction of cDNA libraries as disclosed herein, and shows that the vector is digested
with certain restriction enzymes and ligated to particular 5' and 3' linkers to form a
pAVE1 vector-primer construct.

Figure 9 is a schematic representation of the pNOTs vector from which the pAVE1
vector was derived. The pNOTs vector was derived from pMT2 (Kaufman et al., 1989,
Mol. Cell. Biol. 9: 946-958) by deletion of the DHFR sequences, insertion of a new
polylinker, and insertion of the M13 origin of replication in the ClaI site.
Figure 10 is a schematic representation showing the creation of cDNA libraries by
the combination of RNA-tagged mRNA molecules and pED6pdc4 vector-primer construct
molecules, followed by first-strand synthesis (annealing and elongation by reverse
transcriptase), RNAse digestion, intramolecular renaturation, and second-strand
synthesis.

Figure 11 is a schematic representation showing the creation of cDNA libraries by
the combination of RNA-tagged mRNA molecules and pAVE1 vector-primer construct
molecules, followed by first-strand synthesis (annealing and elongation by reverse
transcriptase), RNAse digestion, intramolecular renaturation, and second-strand
synthesis. Note that in this figure the sequence at the 3' end of the Vector-Primer
construct has been reversed: the 3' should be shown as NV(T) 48 as in the 3' linker shown
in Figure 8.

Figure 12 is an agarose gel of digested cDNA clones showing the results of using
the Primers-Attached Vector Elongation (PAVE) strategy with RNA-tagged globin
mRNA: approximately 80% of the globin cDNAs are the expected size for full-length
cDNA inserts (arrow), while for the untagged RNA controls full-length cDNA inserts are
present at a much lower frequency.

Figure 13 shows schematically the structure of an RNA-tagged CPLA2-y mRNA
molecule used in the experiments of Figures 13-17.

Figure 14 shows schematically the structures and predicted sizes (as number of
nucleotide residues) of different probe-RNA hybrids that could result from RNA-RNA
ligation followed by RNAse digestion to remove single-stranded RNA.

Figure 15 is a digitized scan of radioactively detected RNA molecules separated
electrophoretically on a gel, showing the effect of ATP concentration upon the efficiency
of the reaction adding an RNA tag to an mRNA molecule using T4 RNA ligase. Arrows
show the expected sizes for ligated and unligated molecules. At a relative concentration
of 0.1X (5.8 nM ATP), 50.8 percent of the radioactivity detected was present as ligated
molecules as compared to unligated molecules.

Figure 16 is a digitized scan of cDNA molecules separated electrophoretically on
an agarose gel, showing that T7 polymerase is the most effective in completion of
second-strand synthesis as compared to T4, PFU (Promega, Madison WI), and SEQUENASE
(Amersham Pharmacia Biotech) DNA polymerases.

Figure 17 is a digitized scan of cDNA molecules separated electrophoretically on
a series of agarose gels, showing that the inclusion of tRNA in the RNAse digestion
reaction prior to the second-strand synthesis reaction does not result in the inclusion of
tRNA molecules in the cDNA reaction products. Further, this Figure shows that cDNA
molecules produced without a second-strand synthesis ("Annealed" in the Figure) are
capable of being transformed into host cells and are maintained therein.

DETAILED DESCRIPTION

The following examples, tables, and figures provide examples of ways in which
the methods of the present invention may be accomplished. These examples are not
intended to limit in any manner the number of ways in which these methods may be
carried out by those of skill in the art, or the types of vectors, primers, and other materials
that may be utilized in these methods. In particular, those of skill in the art will appreciate
that by selecting different sequences for the 5' and 3' linkers (also interchangeably called
primers throughout) of the present method, linkers (or primers) can be designed that will
anneal to any vector of known nucleotide sequence digested with any particular
restriction enzyme(s).

For example, the invention also includes polynucleotides with sequences
complementary to those of the polynucleotides disclosed herein. The present invention
also includes polynucleotides which are derived from the polynucleotides disclosed
herein by any of the following or by a combination thereof: addition of residues; deletion
of residues; substitution of residues, whether with polynucleotide residues or other
molecules such as amino acids, carbohydrates, lipids, or modified forms thereof; or
chemical modification of existing residues. Examples of chemical modifications include
but are not limited to methylation, addition of other alkyl groups, addition of aromatic or
heterocyclic molecules, addition or removal of a hydroxyl group, addition of polyethylene
glycol, addition of carbohydrate, polypeptide, or lipid molecules, etc.

The present invention also includes polynucleotides that hybridize under reduced
stringency conditions, more preferably stringent conditions, and most preferably highly
stringent conditions, to polynucleotides described herein. Examples of stringency
conditions are shown in the table below: highly stringent conditions are those that are at
least as stringent as, for example, conditions A-F; stringent conditions are at least as
stringent as, for example, conditions G-L; and reduced stringency conditions are at least
as stringent as, for example, conditions M-R.



Stringency  Polynucleotide  Hybrid Length  Hybridization Temperature                     Wash Temperature
Condition   Hybrid          (bp)‡          and Buffer†                                   and Buffer†

A           DNA:DNA         ≥50            65°C; 1xSSC -or- 42°C; 1xSSC, 50% formamide   65°C; 0.3xSSC
B           DNA:DNA         <50            TB*; 1xSSC                                    TB*; 1xSSC
C           DNA:RNA         ≥50            67°C; 1xSSC -or- 45°C; 1xSSC, 50% formamide   67°C; 0.3xSSC
D           DNA:RNA         <50            TD*; 1xSSC                                    TD*; 1xSSC
E           RNA:RNA         ≥50            70°C; 1xSSC -or- 50°C; 1xSSC, 50% formamide   70°C; 0.3xSSC
F           RNA:RNA         <50            TF*; 1xSSC                                    TF*; 1xSSC
G           DNA:DNA         ≥50            65°C; 4xSSC -or- 42°C; 4xSSC, 50% formamide   65°C; 1xSSC
H           DNA:DNA         <50            TH*; 4xSSC                                    TH*; 4xSSC
I           DNA:RNA         ≥50            67°C; 4xSSC -or- 45°C; 4xSSC, 50% formamide   67°C; 1xSSC
J           DNA:RNA         <50            TJ*; 4xSSC                                    TJ*; 4xSSC
K           RNA:RNA         ≥50            70°C; 4xSSC -or- 50°C; 4xSSC, 50% formamide   67°C; 1xSSC
L           RNA:RNA         <50            TL*; 2xSSC                                    TL*; 2xSSC
M           DNA:DNA         ≥50            50°C; 4xSSC -or- 40°C; 6xSSC, 50% formamide   50°C; 2xSSC
N           DNA:DNA         <50            TN*; 6xSSC                                    TN*; 6xSSC
O           DNA:RNA         ≥50            55°C; 4xSSC -or- 42°C; 6xSSC, 50% formamide   55°C; 2xSSC
P           DNA:RNA         <50            TP*; 6xSSC                                    TP*; 6xSSC
Q           RNA:RNA         ≥50            60°C; 4xSSC -or- 45°C; 6xSSC, 50% formamide   60°C; 2xSSC
R           RNA:RNA         <50            TR*; 4xSSC                                    TR*; 4xSSC

‡: The hybrid length is that anticipated for the hybridized region(s) of the hybridizing polynucleotides. When
hybridizing a polynucleotide to a target polynucleotide of unknown sequence, the hybrid length is assumed
to be that of the hybridizing polynucleotide. When polynucleotides of known sequence are hybridized, the
hybrid length can be determined by aligning the sequences of the polynucleotides and identifying the region
or regions of optimal sequence complementarity.

†: SSPE (1xSSPE is 0.15M NaCl, 10mM NaH2PO4, and 1.25mM EDTA, pH 7.4) can be substituted for SSC
(1xSSC is 0.15M NaCl and 15mM sodium citrate) in the hybridization and wash buffers; washes are
performed for 15 minutes after hybridization is complete.

*TB - TR: The hybridization temperature for hybrids anticipated to be less than 50 base pairs in length should
be 5-10°C less than the melting temperature (Tm) of the hybrid, where Tm is determined according to the
following equations. For hybrids less than 18 base pairs in length, Tm (°C) = 2(# of A + T bases) + 4(# of G +
C bases). For hybrids between 18 and 49 base pairs in length, Tm (°C) = 81.5 + 16.6(log10[Na+]) + 0.41(%G+C) -
(600/N), where N is the number of bases in the hybrid, and [Na+] is the concentration of sodium ions in the
hybridization buffer ([Na+] for 1xSSC = 0.165 M).
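
The two melting-temperature formulas in the footnote above can be worked through numerically. The following is a minimal Python sketch, not part of the disclosed method; the function names and the default sodium concentration argument (taken from the 1xSSC value stated in the footnote) are illustrative assumptions.

```python
import math


def melting_temperature(seq: str, sodium_molar: float = 0.165) -> float:
    """Estimate Tm (deg C) for a hybrid shorter than 50 bases.

    Uses the footnote's two formulas: the 2/4 rule below 18 bases and the
    salt-adjusted formula for 18 to 49 bases. sodium_molar is [Na+] in mol/L
    (0.165 M corresponds to 1xSSC according to the footnote).
    """
    seq = seq.upper()
    n = len(seq)
    if n == 0 or n >= 50:
        raise ValueError("formulas apply only to hybrids of 1 to 49 bases")
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T") + seq.count("U")
    if gc + at != n:
        raise ValueError("sequence contains characters other than A, C, G, T, U")
    if n < 18:
        return 2.0 * at + 4.0 * gc
    percent_gc = 100.0 * gc / n
    return 81.5 + 16.6 * math.log10(sodium_molar) + 0.41 * percent_gc - 600.0 / n


def hybridization_window(seq: str, sodium_molar: float = 0.165) -> tuple:
    """Return (low, high) hybridization temperatures, i.e. Tm-10 to Tm-5."""
    tm = melting_temperature(seq, sodium_molar)
    return (tm - 10.0, tm - 5.0)
```

For a 12-base hybrid with six A/T and six G/C bases, the first formula gives 2(6) + 4(6) = 36°C, so the suggested hybridization temperature would fall between 26°C and 31°C.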

Additional examples of stringency conditions for polynucleotide hybridization are
provided in Sambrook, J., E.F. Fritsch, and T. Maniatis, 1989, Molecular Cloning: A
Laboratory Manual, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY,
chapters 9 and 11, and Current Protocols in Molecular Biology, 1995, F.M. Ausubel et al., eds.,
John Wiley & Sons, Inc., sections 2.10 and 6.3-6.4, incorporated herein by reference.

Preferably, each such hybridizing polynucleotide has a length that is at least
25% (more preferably at least 50%, and most preferably at least 75%) of the length of the
polynucleotide of the present invention to which it hybridizes, and has at least 60%
sequence identity (more preferably, at least 75% identity; most preferably at least 90% or
95% identity) with the polynucleotide of the present invention to which it hybridizes,
where sequence identity is determined by comparing the sequences of the hybridizing
polynucleotides when aligned so as to maximize overlap and identity while minimizing
sequence gaps.
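
The length and identity preferences above can be checked mechanically once two sequences have been aligned. The sketch below is an illustration only: it assumes the alignment has already been produced by some external tool, that gaps are written as "-", and that identity is counted over columns where neither sequence has a gap (one common convention, not one stated in the text). All function and parameter names are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over gap-free columns of a pairwise alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip columns containing a gap
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * matches / compared


def meets_preference(hybridizing_len: int, reference_len: int, identity: float,
                     min_length_fraction: float = 0.25,
                     min_identity: float = 60.0) -> bool:
    """Check the least restrictive thresholds stated above (25% length, 60% identity)."""
    return (hybridizing_len >= min_length_fraction * reference_len
            and identity >= min_identity)
```

Passing `min_length_fraction=0.75` and `min_identity=90.0` would test the most preferred thresholds instead.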

In particular, sequence identity may be determined using WU-BLAST
(Washington University BLAST) version 2.0 software, which builds upon WU-BLAST
version 1.4, which in turn is based on the public domain NCBI-BLAST version 1.4
(Altschul and Gish, 1996, Local alignment statistics, Doolittle ed., Methods in Enzymology
266: 460-480; Altschul et al., 1990, Basic local alignment search tool, Journal of
Molecular Biology 215: 403-410; Gish and States, 1993, Identification of protein coding
regions by database similarity search, Nature Genetics 3: 266-272; Karlin and Altschul,
1993, Applications and statistics for multiple high-scoring segments in molecular
sequences, Proc. Natl. Acad. Sci. USA 90: 5873-5877; all of which are incorporated by
reference herein). WU-BLAST version 2.0 executable programs for several UNIX
platforms can be downloaded from ftp://blast.wustl.edu/blast/executables. The complete
suite of search programs (BLASTP, BLASTN, BLASTX, TBLASTN, and TBLASTX) is
provided at that site, in addition to several support programs. WU-BLAST 2.0 is
copyrighted and may not be sold or redistributed in any form or manner without the
express written consent of the author, but the posted executables may otherwise be freely
used for commercial, nonprofit, or academic purposes. In all search programs in the suite
(BLASTP, BLASTN, BLASTX, TBLASTN and TBLASTX), the gapped alignment
routines are integral to the database search itself, and thus yield much better sensitivity and
selectivity while producing the more easily interpreted output. Gapping can optionally be
turned off in all of these programs, if desired. The default penalty (Q) for a gap of length
one is Q=9 for proteins and BLASTP, and Q=10 for BLASTN, but may be changed to any
integer value including zero, one through eight, nine, ten, eleven, twelve through twenty,
twenty-one through fifty, fifty-one through one hundred, etc. The default per-residue
penalty for extending a gap (R) is R=2 for proteins and BLASTP, and R=10 for BLASTN,
but may be changed to any integer value including zero, one, two, three, four, five, six,
seven, eight, nine, ten, eleven, twelve through twenty, twenty-one through fifty, fifty-one
through one hundred, etc. Any combination of values for Q and R can be used in order to
align sequences so as to maximize overlap and identity while minimizing sequence gaps.
The default amino acid comparison matrix is BLOSUM62, but other amino acid
comparison matrices such as PAM can be utilized.
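
To make the effect of Q and R concrete, the sketch below computes the total penalty for a gap of a given length. It assumes, based only on the definitions above, that a gap of length one costs Q and each additional residue costs R; it is an arithmetic illustration and not code from WU-BLAST. The names `gap_penalty` and `DEFAULTS` are hypothetical.

```python
# Default (Q, R) pairs as stated in the text above.
DEFAULTS = {
    "BLASTP": (9, 2),
    "BLASTN": (10, 10),
}


def gap_penalty(length: int, q: int, r: int) -> int:
    """Total penalty for one gap: Q for the first residue, R for each extension."""
    if length < 1:
        raise ValueError("gap length must be at least 1")
    return q + r * (length - 1)


def default_gap_penalty(program: str, length: int) -> int:
    """Penalty for a gap of the given length under the stated program defaults."""
    q, r = DEFAULTS[program]
    return gap_penalty(length, q, r)
```

Under these assumptions a three-residue gap costs 9 + 2 + 2 = 13 with the protein defaults and 10 + 10 + 10 = 30 with the BLASTN defaults, which shows why the nucleotide defaults penalize long gaps much more heavily.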

A number of types of cells may act as suitable host cells to be transformed with the
products of the cDNA library preparation reactions. Mammalian host cells include, for
example, monkey COS cells, Chinese Hamster Ovary (CHO) cells, human kidney 293 cells,
human epidermal A431 cells, human Colo205 cells, 3T3 cells, CV-1 cells, other
transformed primate cell lines, normal diploid cells, cell strains derived from in vitro
culture of primary tissue, primary explants, HeLa cells, mouse L cells, BHK, HL-60, U937,
HaK or Jurkat cells. Alternatively, it may be possible to use host cells such as lower
eukaryotes like yeast or prokaryotes such as bacteria. Potentially suitable yeast strains
include Saccharomyces cerevisiae, Schizosaccharomyces pombe, Kluyveromyces strains, Candida,
or any yeast strain capable of being transformed with cDNA clones. Potentially suitable
bacterial strains include Escherichia coli, Bacillus subtilis, Salmonella typhimurium, or any
bacterial strain capable of being transformed with cDNA clones.

Patent and literature references cited herein are incorporated by reference as if
fully set forth.
In this proposal, we describe an improved strategy (compared to Kato
et al., 1994) that we call Primers-Attached-Vector Elongation (PAVE).
The crucial element of the strategy is a novel vector with attached
primers for both first-strand and second-strand cDNA synthesis. The
oligo-dT primer attached to one end of the vector is used to prime
first-strand cDNA synthesis from the poly(A) stretch of the mRNA, whose cap
has been specifically labeled with a 27-mer biotinylated RNA tag. After
digestion of the single-stranded RNA with RNase I, full-length cDNA is
captured by streptavidin beads. Second-strand synthesis is then carried
out using the primer (with sequence identical to the RNA tag) at the other
end of the vector, which would specifically base pair with a full-length
cDNA that contains a sequence complementary to the RNA tag. This will
give rise to a circularized plasmid for subsequent E. coli transformation.
Since no DNA ligation will be necessary after cDNA synthesis, all the
possible artifacts generated by cDNA-vector ligation will in theory be
eliminated. In addition, the availability of double-strand vectors
containing single-strand cDNA inserts before the second-strand cDNA
synthesis provides a mechanism for library normalization and
subtraction and would also allow subgrouping the cDNA libraries into the
subset encoding secreted and membrane proteins and the subset encoding
soluble proteins.
Examples

Example 1 
Preparation of Vector-Primer 

Plasmid vector pED6dpc4 was completely digested with EcoR I and 
Sal I. Thirty micrograms of digested plasmid DNA was then ligated with 
840 pmol each of the following two linkers: 

Linker 1

-i^TTULA^ - (SEQ. ID. #1)

3 ' -<3C7IC-i2TTGIGrtGCriOQSG-5 ' (SEQ. ID. #2)

Linker 2

5 ' -Cl?ATC^.TCCC<^i7rGGTftC-3 ' (SEQ. ID. #3)
3 ' - (T) 30 GATI£j^£TAG^^ ' -Phosphate (SEQ. ID. #4)

in a 1.4 ml reaction volume using T4 DNA ligase (NEB) under conditions 
suggested by the manufacturer. The ligated plasmid DNA was then 
purified through electrophoresis on a 0.8% agarose gel. 

Example 2 

Ligation of a Biotinylated RNA Tag to the 5'-end of Full-length mRNA 

Ten µg of rabbit globin mRNA was treated with 5 units of HK
phosphatase (Epicentre) in a total volume of 250 µl under conditions
recommended by the manufacturer. After incubation at 37°C for 30 min,
the mixture was extracted with phenol/chloroform and precipitated with
NaOAc/ethanol. The pellet was dissolved in 20 µl of DEPC-treated water,
of which 19.5 µl was subjected to digestion with 5 units of tobacco
acid pyrophosphatase (TAP) in a 50 µl volume. The reaction was carried
out at 37°C for 30 min and terminated by phenol/chloroform extraction.
After NaOAc/ethanol precipitation, the pellet was dissolved in 20 µl of
DEPC-treated water. Fifteen µg of TAP-treated RNA was then ligated to 7
µg of RNA tag (27-mer synthetic ribonucleotide with a 5' biotin group) in a
120 µl reaction mixture containing 50 mM Tris-Cl, pH 7.8, 10 mM MgCl2,
10 mM DTT, 1 mM ATP and 12 units of T4 RNA ligase (Takara). After
overnight incubation at room temperature, the sample was extracted
twice with phenol/chloroform and precipitated with NaOAc/ethanol. The
pellet was dissolved in DEPC-treated water.

As a control experiment, 2.5 µg of the TAP-treated RNA was ligated
to 2.5 µg of 5' biotinylated DNA tag in a reaction volume of 40 µl and the
sample was treated as described above.

To assess the efficiency of ligating the RNA or DNA tag to rabbit
globin mRNA, 0.25 µg of the RNA samples were electrophoresed on a
4-20% TBE/PAGE minigel (Novex) and blotted onto nylon-plus membrane
(QIAGEN). After hybridization with 32P-labeled anti-tag (SEQ ID # 5'-
GAGGCGTATCAGCTGGTCACT-3') according to Sambrook et al., 1989, the
position of mRNA molecules ligated with either the RNA or DNA tag was
revealed by autoradiography. As judged from Figure 4, the RNA tag is ligated
to the TAP-treated mRNA much more efficiently than the DNA tag.
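
Because the anti-tag probe is described as complementary to the tag, the stretch of tag it is expected to anneal to can be derived by reverse complementation. The following minimal Python sketch illustrates this; the function names are hypothetical, and since the full 27-mer tag sequence is not given in this passage, only the 21-nucleotide region covered by the probe is derived.

```python
# Watson-Crick complements for DNA bases.
_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

# Anti-tag probe sequence as quoted in the text (written 5' to 3').
ANTI_TAG = "GAGGCGTATCAGCTGGTCACT"


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence, read 5' to 3'."""
    return "".join(_COMPLEMENT[base] for base in reversed(dna.upper()))


def as_rna(dna: str) -> str:
    """Write a DNA sequence in RNA letters (T replaced by U)."""
    return dna.upper().replace("T", "U")


# Region of the tag (as DNA letters, then as RNA letters) that the probe
# would base pair with, assuming perfect complementarity.
tag_region_dna = reverse_complement(ANTI_TAG)
tag_region_rna = as_rna(tag_region_dna)
```

Applying `reverse_complement` twice returns the original probe sequence, which is a quick sanity check on the complement table.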

Example 3 
cDNA Synthesis and Cloning 

Approximately 1.25 µg of biotin-RNA-tagged mRNA was mixed with
1.2 µg of vector-primer in a final volume of 20 µl containing 50 mM
Tris-Cl, pH 8.3, 75 mM KCl, 3 mM MgCl2, 10 mM DTT, 0.5 mM each of the four
dNTPs and 200 units of Superscript II (GIBCO BRL), and the reaction was
carried out at 48°C for 1 hour. The cDNA was then extracted with
phenol/chloroform and precipitated with ethanol. The pellet was dissolved
in water and digested with 25 units of RNase One (Promega) and 6 units of
E. coli RNase H (Epicentre) in 60 µl of reaction mixture containing 10 mM
Tris-Cl, pH 7.9, 10 mM MgCl2, 50 mM NaCl and 1 mM DTT. After 1 hour of
incubation at 37°C, 30 µl of water and 10 µl of 10X annealing buffer (0.5
M Tris-Cl, pH 8.0, 0.1 M MgCl2 and 0.5 M NaCl) were added and the mixture
was heated at 70°C for 5 min and slowly cooled down to 50°C over 30 min.
Ten µg of glycogen was then added and the DNA was precipitated in
NaOAc/ethanol.

For second-strand cDNA synthesis, the above DNA pellet was
dissolved in 13 µl of water, and 2 µl of 10X T4 DNA polymerase buffer
(NEB), 4 µl of dNTPs (2.5 mM each), 1 µl of 1 mg/ml BSA and 1 µl (3
units) of T4 DNA polymerase were subsequently added. After 1 hr at 37°C,
the DNA was precipitated and used to transform competent E. coli cells
(DH10B, GIBCO BRL).

When tagged rabbit globin mRNA was used in the above procedure,
the efficiency of the library was about 10^6 colonies/µg of starting mRNA.
When plasmids were isolated from randomly picked individual
colonies and digested with Asc I and Not I to release the insert, 37 out of
[...] had full-length (about 650 bp) cDNA inserts. In addition, 5'-end
and 3'-end DNA probes were used to hybridize to duplicate filters lifted
from plated colonies, and 75.8% of the colonies are full-length as judged
by being able to hybridize to both probes (Table 1).
Experimental Design And Expected Results

I. Construction of a multi-purpose vector (pAVE1) for in vitro
and in vivo protein expression

A vector, pAVE1, has been constructed for our large-scale
molecular biology effort to obtain the full-length cDNAs of all the human
secreted proteins in a single cloning step. pAVE1 is derived from pNOTS
by replacing its Pst I/Xho I fragment with a 100 bp designed linker. Some
of the notable features of pAVE1 include:

A). T7 and T3 RNA polymerase promoters flanking the cDNA insert to
be cloned from 5' to 3' into the Eco RI and Kpn I sites, allowing sense and
anti-sense RNA molecules to be synthesized, respectively. The T7 RNA
promoter also allows a coupled in vitro transcription and translation (TNT)
protocol to be used to assess the size of the encoded protein products.

B). Four eight-base recognizing restriction sites flanking the T7 and T3
promoters, permitting easy subcloning of the cDNA inserts.

C). Suitable for COS expression because of the SV40 origin and the
eukaryotic expression cassette.

D). The f1 origin (from the pNOTS backbone) would allow ssDNA to be
prepared for library subtraction and normalization. In addition,
recombinant f1 phage particles can be used to transfect COS cells
(Yokoyama-Kobayashi and Kato, 1993). If we could engineer a patentable
COS cell line that can specifically and efficiently endocytose f1 phage
particles, then we could carry out COS transfection on a large scale
without the need for plasmid preparation.

II. Preparation of primers-attached-vector 

Eco RI and Kpn I digested pAVE1 plasmid DNA will be gel-purified and
ligated to the 5'-end linker, which is compatible with the Eco RI end and
contains a single-stranded sequence identical to the RNA tag, and to the
3'-end linker, which is compatible with the Kpn I end and contains a
single-strand oligo-dT sequence. The ligated DNA product will be gel-purified
and the presence of the primers will be confirmed by digestion with Hind
III and Bst XI followed by polyacrylamide gel analysis. More
than 90% of the vector should be attached to the two primers if the
proper linker/vector ratio is used. Otherwise, the desired
primers-attached vector DNA should be purified by consecutive oligo-dA column and
anti-RNA tag oligonucleotide column.

III. Tagging the cap of the mRNA with oligoribonucleotides 

The mRNA samples will be treated with the heat-killable (HK)
phosphatase isolated from an antarctic bacterium (Epicentre) to remove
the phosphate group at the 5' ends of degraded RNA molecules.
The cap of the full-length RNA population will be removed with tobacco
acid pyrophosphatase (TAP; Shinshi et al., 1976a and 1976b; Efstratiadis
et al., 1977; Fromont-Racine et al., 1993; Maruyama and Sugano, 1994;
Kato et al., 1994). The decapped mRNA molecules will then be ligated to a
27-mer biotinylated oligoribonucleotide (RNATAG) using T4
RNA ligase. The small RNA tag will then be removed by repetitive ethanol
precipitation.

There are two limitations to this procedure, i.e., the low ligation
efficiency (about 60%, Tessier et al., 1986) and the small proportion of
mRNA-mRNA ligation. However, since selection of full-length cDNA will be
applied after first-strand cDNA synthesis (RNase I digestion followed by
streptavidin capture) and during second-strand synthesis (specific priming
from the vector-attached primer), this may not have a great detrimental
effect on the quality of the cDNA library (although it can reduce the
number of colonies produced from a definite amount of mRNA).

IV. First strand cDNA synthesis and full-length cDNA enrichment 

The tagged mRNA will be annealed to the primers-attached-pAVE1
vector and first-strand cDNA synthesis will be carried out using
Superscript II reverse transcriptase (GIBCO-BRL). The first-strand
cDNA, together with the associated mRNA template, will be
precipitated and subjected to RNase I digestion to degrade unprotected
single-strand RNA regions as well as unreacted free mRNA molecules.
In this reaction, only the biotin group of the mRNA whose cDNA
is full-length will be protected from clipping off the vector-primer-cDNA
assembly. The full-length cDNA-vector molecules will then be captured
using streptavidin magnetic beads and subjected to complete RNase H and
alkaline hydrolysis to remove the RNA strand. This will produce a
population of single-strand full-length cDNA covalently linked to the
pAVE1 vector through the poly (A/T) region. The full-length cDNA
population will account for about 7-10% of the total cDNA synthesized by
reverse transcriptase according to Carninci et al., 1996.

V. Second strand cDNA synthesis and transformation 

The cDNA-vector molecules will be diluted, denatured and reannealed
to allow base pairing between the vector-attached primer and the
extreme 3' end of the single-strand full-length cDNA. Second-strand
cDNA will be synthesized using T4 DNA polymerase. The resulting
double-stranded circular DNA (with two gaps at each end of the cDNA) will
be used to transform E. coli strain DH10B or DH5α. More than 10^6 primary
colonies should be obtained for each microgram of vector-primer.

VI. Assessment of the quality of the cDNA library 

A). Globin mRNA control

Pure globin mRNA (about 700 bases for both subunits) will be used to
prepare a PAVE cDNA library. Duplicate filters from plates containing a
total of at least 10,000 colonies will be hybridized with the 5'-end
probe and the 3'-end probe, respectively. The ratio of 5'-end positive
clones to 3'-end positive clones should be close to 1. At least 100 primary
colonies will be picked for plasmid DNA preparation. Insert size will be
determined by Asc I/Not I digestion. At least 90% of the colonies should
have a full-length cDNA insert.
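
The acceptance criteria for the globin control can be written down as a small check. The sketch below is illustrative only: the function name, the tolerance placed on the 5'/3' ratio (the text says only "close to 1") and the example counts are assumptions, not values from the protocol.

```python
def globin_control_passes(n_5prime_pos, n_3prime_pos, n_picked, n_full_length,
                          ratio_tolerance=0.1, min_full_length=0.90):
    """Return True if the globin control library meets both stated criteria.

    ratio_tolerance is a hypothetical reading of "close to 1";
    min_full_length and the 100-colony minimum come from the text.
    """
    if n_3prime_pos == 0 or n_picked < 100:
        return False
    ratio = n_5prime_pos / n_3prime_pos
    ratio_ok = abs(ratio - 1.0) <= ratio_tolerance
    full_length_ok = (n_full_length / n_picked) >= min_full_length
    return ratio_ok and full_length_ok


# Made-up example counts, for illustration only.
print(globin_control_passes(980, 1000, 100, 93))
print(globin_control_passes(700, 1000, 100, 93))
```

The tolerance is exposed as a parameter because the text does not fix how close to 1 the ratio must be.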

B). A real cDNA library

A PAVE cDNA library will be made from mRNA isolated from a
human tissue source, preferably pancreas. The GAPDH 5'- and 3'-end probes
will be used for colony hybridization to assess the ratio of clones
containing GAPDH cDNA inserts with 5' and 3' sequences. If the ratio is
close to 1 as expected, 300 colonies will be randomly picked from the
entire library for plasmid preparation and the insert size will be
determined for each clone. More than 95% of the clones should have a cDNA
insert. In addition, the plasmid DNA samples will be subjected to coupled in
vitro transcription and translation (TNT) analysis in the presence of
35S-labeled methionine. The size of the synthesized protein will be analyzed by
4-20% SDS-PAGE followed by autoradiography. If more than 90% of the
insert-containing clones give rise to a protein product in the TNT assay,
3000 colonies will be subjected to 5'-end sequencing and the data will be
subjected to bioinformatics evaluation.

An additional, and perhaps more rigorous, approach to evaluating the
quality of the library is to screen for the presence of the 7 kb full-length
cDNA for human cPLA2β, whose mRNA is ubiquitously expressed but most
abundant in pancreas. Previous efforts have produced more than 100
positive clones from four cDNA libraries, and none of them is full-length
(Song, Kriz, Bean and Knopf, unpublished).

Future Considerations 



The following efforts should be considered to expedite our progress in
cloning all the human cDNAs for secreted or membrane proteins and to
facilitate their functional analysis:

I. Enrichment of cDNAs for secreted and membrane proteins 

Strategy 1: Highly pure rough ER will be isolated by refining the 
sucrose-density centrifugation parameters. The mRNA molecules will be 
isolated, their poly A tails removed by oligo (dT)-directed RNase H 
digestion and the 5'-end cap labeled by biotin (Carninci, et al., 1996). The 
labeled rough ER mRNA will be hybridized with the single-stranded cDNA- 
vector population prepared from high quality total mRNA. After capture 
with streptavidin beads, the bound cDNA will be eluted and used to prepare 
a subset of cDNA library which should be highly enriched in cDNA 
molecules for secreted or membrane-bound proteins. 



Strategy 2: Explore the possibility of in vitro TNT based library
subgrouping: Plasmid DNA from a PAVE cDNA library will be prepared and
subjected to in vitro TNT for a defined length of time. Inhibitors of T7 RNA
polymerase and the translation machinery will be added to freeze the
cDNA-RNA-nascent peptide complex. If the nascent peptide contains a
secretion signal, the complex will be captured by a solid phase conjugated
with signal recognition particle (SRP). The captured cDNA-vector
population will be used to transform E. coli cells to create a subset
enriched in cDNAs for proteins with a signal peptide.

II. Subtraction 

The full-length cDNA clones for the most abundant mRNA species 
will be obtained when we sequence our first 3000 clones for library 
quality assessment. These clones will be collected and biotinylated sense 
RNA transcripts will be made from the Not I linearized plasmid DNA using 
T7 RNA polymerase. After removal of the 5' and 3' vector sequences on the 
RNA using an oligonucleotide-directed RNase H digestion approach, the
remaining RNA will be used to subtract their corresponding cDNAs from 
the single-strand cDNA-vector population. The remaining cDNA-vector 
population should be enriched with rare messages. 

III. Normalization 

Normalization of PAVE libraries could be carried out before the
initial bacterial transformation step, unlike in the original normalization
protocol where amplified single-strand phagemid DNA was used (Soares, et
al., 1994). Therefore, normalized PAVE cDNA libraries should have the
same cDNA representation as the unnormalized primary library,
minimizing the chance of losing some cDNAs that are selected against
during amplification.

IV. An ES cell line library? 

If we succeed in constructing a normalized PAVE cDNA library with
more than 95% of the inserts being full-length and encoding a protein
product by TNT assay, then we can design a special vector which can
direct the recombination of the cDNA insert into a specific locus in the
mouse genome. Linearized plasmid DNA prepared from the library will be
used to transfect ES cells. The ES clones containing individual cDNA
inserts at the expected location will be isolated and the identity of the
cDNA analyzed by PCR and sequencing. Eventually, we should be able to
establish an ES cell line library for convenient transgenic mice production.
This is the opposite of the Merck-Lexicon approach, where ES cell lines with
disrupted genes are collected for production of knock-out mice, but may be
more relevant to the drug-discovery scenario, since most drugs are
inhibitors of a disease target.

Tagging of mRNA

** Do all RNA set-up in a tissue culture hood **

** Do the following in siliconized RNASE-FREE 1.5 ml tubes (Ambion).

ALL reagents are made in DEPC-WATER (Ambion).

Use only ART tips for all reactions.

Clean pipettes with RNASE AWAY and EtOH.

Place a new piece of lab paper on your bench (plastic side up).

Wear gloves at all times!!!
IN GENERAL, CLEAN UP YOUR WORK AREA!!!!!!!
(RNASES are EVERYWHERE.)

DAY ONE:

Today: We are using 0.24-9.5 KB markers (1 ug/ul), TF-1 mRNA (1 ug/ul) & Globin
mRNA (1 ug/ul)

Turn the heating block on to 37 °C.

1 ul tRNA (5 ug/ul)
36 ul/39 ul DEPC-water (Ambion)
5 ul 10X BAP Buffer (Gibco)
0.75 ul 0.1 M DTT (Homemade-Sigma)
1.25 ul RNAsin (40 u/ul) (Promega)
5 ul/2 ul mRNA (1 ug/ul) (2 ug)
1 ul BAP (150 u/ul) (Gibco)

V T = 50 ul
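
A quick way to catch pipetting-sheet errors is to confirm that each column of the reaction adds up to the stated total volume. The sketch below does this for the BAP reaction; it assumes the two-column reading of the table (36 ul water with 5 ul mRNA, or 39 ul water with 2 ul mRNA), and the dictionary and function names are hypothetical.

```python
import math

# Component volumes in ul, as read from the BAP reaction above.
BAP_COMMON = {
    "tRNA": 1.0,
    "10X BAP buffer": 5.0,
    "0.1 M DTT": 0.75,
    "RNAsin": 1.25,
    "BAP": 1.0,
}

# Assumed pairing of the slash-separated values (water / mRNA).
BAP_VARIANTS = {
    "5 ul mRNA": {"DEPC-water": 36.0, "mRNA": 5.0},
    "2 ul mRNA": {"DEPC-water": 39.0, "mRNA": 2.0},
}


def total_volume(common, variant):
    """Sum the shared components and the column-specific components."""
    return sum(common.values()) + sum(variant.values())


for name, variant in BAP_VARIANTS.items():
    total = total_volume(BAP_COMMON, variant)
    print(name, total, "ul", "OK" if math.isclose(total, 50.0) else "MISMATCH")
```

The same check can be pointed at the TAP and ligation tables by swapping in their volumes.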

* Incubate at 37 °C for 0.5 hour on a heating block with cover (pipette box top). If there
is condensation, then do a quick spin.
* Add 100 ul of DEPC-water then add 150 ul of phenol/CHCl3/IAA pH 7.9 (Ambion)
and "flick" for 0.5 min. Spin for 4-6 minutes in microcentrifuge at 14,000 rpm.
Remove 125 ul aqueous layer with pipette (TOP) and place into new 1.5 ml
RNASE-FREE tube.

* Add 125 ul of DEPC-water (Ambion) to the original tube (bottom) and "flick" for 30
seconds. Spin for 4-6 minutes in microcentrifuge at 14,000 rpm. Remove 125 ul
aqueous layer with pipette (TOP) and place with the other aqueous layer in the 1.5
ml RNASE-FREE tube.

* Add 25 ul 3M NaOAc, pH 4.5 (Autoclaved from media prep) and 625 ul of 100%
EtOH. Incubate on dry ice for 5-8 minutes.

* Spin for 10-15 minutes at 4 °C at 14,000 rpm. Remove and SAVE (in a 1.5 ml
RNASE-FREE tube) all of the EtOH layer except approximately 50 ul. Spin as
above for 5 minutes. Remove the remaining EtOH without disrupting the pellet.
Wash pellet with 200 ul of 80% EtOH chilled at -20 °C and spin for 2-5 minutes
at 4 °C at 14,000 rpm. Remove EtOH and again spin for 1 minute at 14,000 RPM
and remove the remaining 1-5 ul of EtOH by just touching a 20 ul pipette tip to
the edge of the drop of EtOH. Air dry with lids open on ice for 5 minutes.

* Resuspend in 20 ul DEPC-Water (Ambion) (100 ng/ul)
** Save 500 ng (5 ul) of RNA markers only



1 ul tRNA (5 ug/ul)
21.7 ul/26.7 ul DEPC-water (Ambion)
5 ul 10X TAP buffer (Epicenter)
1.3 ul RNAsin (Promega)
20 ul/15 ul "BAP-ed" mRNA
1 ul TAP (10 u/ul) (Epicenter)

V T = 50 ul



* Incubate at 37 °C for 0.5 hour on a heating block with cover (pipette box top). If
there is condensation, then do a quick spin.
* Add 150 ul water. Add 150 ul of phenol/CHCl3/IAA pH 7.9 (Ambion) and "flick" for
30 seconds. Spin for 4-6 minutes in microcentrifuge at 14,000 rpm. Remove 125
ul aqueous layer with pipette (TOP) and place into new 1.5 ml RNASE-FREE
tube.

* Add 125 ul of DEPC-water (Ambion) to the original tube (bottom) and "flick" for 30
seconds. Spin for 4-6 minutes in microcentrifuge at 14,000 rpm. Remove 125 ul
aqueous layer with pipette (TOP) and place with the other aqueous layer in the 1.5
ml RNASE-FREE tube.

* Add 25 ul 3M NaOAc, pH 4.5 (Autoclaved from media prep) and 625 ul of 100%
EtOH. Incubate on dry ice for 5-8 minutes.

* Spin for 10-15 minutes at 4 °C at 14,000 rpm. Remove and SAVE (in a 1.5 ml
RNASE-FREE tube) all of the EtOH layer except approximately 50 ul. Spin as
above for 5 minutes. Remove the remaining EtOH without disrupting the pellet.
Wash pellet with 400 ul of 80% EtOH chilled and spin for 2-5 minutes at 4 °C at
14,000 rpm. Remove EtOH and again spin for 1 minute at 14,000 RPM and
remove the remaining 1-5 ul of EtOH by just touching a 20 ul pipette tip to the
edge of the drop of EtOH. Air dry with lids open on ice for 5 minutes.
* Resuspend in 20 ul DEPC-Water (Ambion) (75 ng/ul)
** Save 500 ng (6.7 ul) of RNA markers only

** Ligase Buffer: 0.25 M Tris pH 7, 0.25 M Tris pH 8, 0.1 M MgCl2 (ALL Ambion
Solutions)

** You have approximately 2 ug to ligate at this point.
** (1) RNA Markers, (2) Globin, (3) TF-1 mRNA
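
The marker concentrations quoted after the BAP and TAP steps (100 ng/ul, then 75 ng/ul) can be reproduced with simple mass bookkeeping. The sketch below assumes 2 ug of RNA markers enters the BAP step, that nothing is lost in the extractions, and that a 500 ng aliquot is removed after each step; the function names are illustrative.

```python
def concentration_ng_per_ul(mass_ng, volume_ul):
    """Concentration after resuspending mass_ng in volume_ul."""
    return mass_ng / volume_ul


def aliquot_volume_ul(aliquot_ng, conc_ng_per_ul):
    """Volume that contains aliquot_ng at the given concentration."""
    return aliquot_ng / conc_ng_per_ul


marker_ng = 2000.0  # assumed starting mass of RNA markers (2 ug), no losses

# After BAP: resuspend in 20 ul, save 500 ng for the gel.
conc_after_bap = concentration_ng_per_ul(marker_ng, 20.0)
saved_after_bap_ul = aliquot_volume_ul(500.0, conc_after_bap)
marker_ng -= 500.0

# After TAP: resuspend in 20 ul, save another 500 ng.
conc_after_tap = concentration_ng_per_ul(marker_ng, 20.0)
saved_after_tap_ul = aliquot_volume_ul(500.0, conc_after_tap)
marker_ng -= 500.0

print(conc_after_bap, saved_after_bap_ul)
print(conc_after_tap, round(saved_after_tap_ul, 1))
print("marker mass left for ligation (ng):", marker_ng)
```

Real recoveries after phenol extraction and precipitation are lower than this idealized bookkeeping suggests, which is why the gel check on the saved aliquots matters.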



1 ul tRNA (5 ug/ul)
56.95 ul/58 ul/64.7 ul DEPC-water (Ambion)
10 ul 10X NEW Ligase Buffer (HOMEMADE-see recipe)
1 ul 1M DTT (HOMEMADE-see recipe)
2.5 ul RNAsin (40 u/ul) (Promega)
1.8 ul FRESH 10 mM ATP (Gibco-BRL)
1.75 ul/0.7 ul/0.7 ul RNA-TAG (100 pmol/ul) (IDT)
20 ul/20 ul/13.3 ul TAP-treated mRNA (2 ug) (ABOVE reaction)
5 ul T4 RNA Ligase (5 u/ul) (GIBCO-BRL)

V T = 100 ul
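
To see what molar excess of tag over mRNA the ligation uses, the mass of mRNA can be converted to picomoles. The sketch below is a rough estimate only: it assumes an average mRNA length of 1.5 kb and an average mass of about 340 g/mol per ribonucleotide (a common approximation, not a value from this protocol), and takes the first-column tag amount of 1.75 ul at 100 pmol/ul. In a T4 RNA ligase reaction the molecule supplying the 3'-OH is the acceptor and the one supplying the 5'-phosphate is the donor, so here the tag is treated as acceptor and the decapped mRNA as donor.

```python
AVG_NT_MASS_G_PER_MOL = 340.0  # assumed average mass of one RNA nucleotide


def rna_pmol(mass_ug, length_nt, nt_mass=AVG_NT_MASS_G_PER_MOL):
    """Convert a mass of RNA (ug) of a given average length to picomoles."""
    grams = mass_ug * 1e-6
    moles = grams / (length_nt * nt_mass)
    return moles * 1e12


mrna_pmol = rna_pmol(mass_ug=2.0, length_nt=1500)  # decapped mRNA (donor)
tag_pmol = 1.75 * 100.0                            # RNA tag (acceptor), pmol
acceptor_to_donor = tag_pmol / mrna_pmol

print(f"mRNA: {mrna_pmol:.2f} pmol, tag: {tag_pmol:.0f} pmol, "
      f"ratio: {acceptor_to_donor:.0f}")
```

Because the result scales inversely with the assumed average transcript length, the ratio should be read as an order-of-magnitude figure.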



Incubate at 16 °C for 16 hours (overnight).

* Add 50 ul of DEPC-water. Add 150 ul of phenol/CHCl3/IAA pH 7.9 (Ambion) and
"flick" for 30 seconds. Spin for 4-6 minutes in microcentrifuge at 14,000 rpm.
Remove 125 ul aqueous layer with pipette (TOP) and place into new 1.5 ml
RNASE-FREE tube.

* Add 125 ul of DEPC-water (Ambion) to the original tube (bottom) and "flick" for 30
seconds. Spin for 4-6 minutes in microcentrifuge at 14,000 rpm. Remove 125 ul
aqueous layer with pipette (TOP) and place with the other aqueous layer in the 1.5
ml RNASE-FREE tube.

* Add 25 ul 3M NaOAc, pH 4.5 (Autoclaved from media prep) and 625 ul of 100%
EtOH. Incubate on dry ice for 5-8 minutes.

* Spin for 10-15 minutes at 4 °C at 14,000 rpm. Remove and SAVE (in a 1.5 ml
RNASE-FREE tube) all of the EtOH layer except approximately 50 ul. Spin as
above for 5 minutes. Remove the remaining EtOH without disrupting the pellet.
Wash pellet with 400 ul of 80% EtOH chilled and spin for 2-5 minutes at 4 °C at
14,000 rpm. Remove EtOH and again spin for 1 minute at 14,000 RPM and
remove the remaining 1-5 ul of EtOH by just touching a 20 ul pipette tip to the
edge of the drop of EtOH. Air dry with lids open on ice for 5 minutes.

* Resuspend in 4 ul DEPC-Water (Ambion) (250 ng/ul) (markers), (500 ng/ul)
(mRNA)

* SAVE 500 ng (2 ul) RNA markers

DAY TWO:

** Continue with 2 ug and 5 ul of TF-1 mRNA (for biotin-capture)

1st Strand Synthesis

** Add components in the order they are listed.



1.0 ul tRNA
1.0 ul DEPC-treated water
4.0 ul 5X 1st Strand Buffer
2.0 ul 100 mM DTT
0.5 ul 20 mM dNTPs (fresh)
4.7 ul/3.7 ul pED4 NT35 (8/14/98, 300 ng/ul) total 1.1 ug
0.5 ul RNAsin
4.0 ul Globin mRNA (total 1 ug)/MG63 mRNA (total 2 ug)
2.0 ul Superscript II (Gibco-BRL)
1.3 ul Thermoscript RT

V T = ~20 ul







* Incubate at 48 °C for 1 hour, 55 °C for 30 minutes

* Add 130 ul of water and 150 ul of phenol/CHCl3/IAA pH 7.9 (Ambion) and "flick" for
0.5 min. Spin for 4-6 minutes in microcentrifuge at 14,000 rpm. Remove 125 ul
aqueous layer with pipette (TOP) and place into new 1.5 ml RNASE-FREE tube.

* Add 125 ul of DEPC-water (Ambion) to the original tube (bottom) and "flick" for 30
seconds. Spin for 4-6 minutes in microcentrifuge at 14,000 rpm. Remove 125 ul
aqueous layer with pipette (TOP) and place with the other aqueous layer in the 1.5
ml RNASE-FREE tube.

* Add 25 ul 3M NaOAc, pH 4.5 (Autoclaved from media prep) and 625 ul of 100%
EtOH. Incubate on dry ice for 5-8 minutes.

* Spin for 10-15 minutes at 4 °C at 14,000 rpm. Remove and SAVE (in a 1.5 ml
RNASE-FREE tube) all of the EtOH layer except approximately 50 ul. Spin as
above for 5 minutes. Remove the remaining EtOH without disrupting the pellet.
Wash pellet with 400 ul of 80% EtOH chilled at -20 °C and spin for 2-5 minutes
at 4 °C at 14,000 rpm. Remove EtOH and again spin for 1 minute at 14,000 RPM
and remove the remaining 1-5 ul of EtOH by just touching a 20 ul pipette tip to
the edge of the drop of EtOH. Air dry with lids open on ice for 5 minutes.

* Resuspend in 51.5 ul of DEPC-treated water

0.8% TBE Agarose

*** Use only depyrogenated glassware to make the buffer and the gel.
** Wash your gel box and casting tray with RNASE AWAY.

* Make 1X TBE Buffer by adding 110 ml of 10X TBE to 1 L of sterile milli-Q water.
You may need to make 2 bottles, depending on the size of your gel.

* Using a depyrogenated graduated cylinder, measure 120 ml of 1X TBE buffer and pour
it into a 500 ml depyrogenated flask. Measure out 1 g of ultra-pure agarose (BI
101) by shaking it into a weigh boat. Add the agarose to the buffer in the flask
and swirl.

* Heat the agarose approximately 1.5 minutes in a microwave, or until the agarose is
clear. Allow it to cool until you can touch it with your bare hands without it
burning, approximately 10 minutes. Add 10 ul of 10 mg/ml ethidium bromide,
swirl and pour it into a casting tray. Add comb to the gel and remove all bubbles
with a pipet tip.

* Wait until it is completely solidified, approximately 20 minutes. In the meantime, add
Gel Loading Buffer II (Ambion) in equal volume to your saved samples from
the previous three reactions. (Example: if you saved 1 ul then you add 1 ul of
dye.) You should have 3 samples of RNA markers, one after each reaction.
Also, add 0.5 ul of 0.24-9.5 KB RNA Ladder (Gibco-BRL) with 2 ul of water and
2 ul of dye for your gel marker.

* Heat 200 ml of sterile milli-Q water in a 500 ml beaker in the microwave until it boils
or set up an 80 °C heat block. Place your gel samples with dye into the water for 5
minutes at 80 °C. Then place them directly on to ice, until you are ready to load
them onto the gel.

* Once the gel is hardened place it into the buffer chamber and add buffer to cover it.
Load your samples onto the gel. Run the gel at 100 volts for approximately 1 hour,
or until the first dye line reaches 2/3 of the length of the gel. Stop the gel and
take a picture.

* You may have lost some mRNA as you progressed through each reaction, shown by the
decrease in intensity of the stained mRNA; HOWEVER, the mRNA should all be
the same size on the gel. If degradation has occurred, there will be a downshift in
the size of the mRNA as the process progressed.
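
The gel recipe above can be sanity-checked with two one-line calculations. The sketch below assumes the buffer step reads 110 ml of 10X TBE added to 1 L of water and that 1 g of agarose goes into 120 ml of buffer; the function names are illustrative.

```python
def percent_w_v(grams, volume_ml):
    """Weight/volume percentage: grams of solute per 100 ml."""
    return 100.0 * grams / volume_ml


def diluted_strength(stock_x, stock_ml, diluent_ml):
    """Final strength (in X) after adding stock to diluent."""
    return stock_x * stock_ml / (stock_ml + diluent_ml)


gel_percent = percent_w_v(1.0, 120.0)               # nominal 0.8% gel
tbe_strength = diluted_strength(10.0, 110.0, 1000.0)  # nominal 1X buffer

print(f"agarose: {gel_percent:.2f}% w/v, TBE: {tbe_strength:.2f}X")
```

Both values land close to the nominal figures, which is all a running gel needs.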

52.0 ul/51.5 ul cDNA (1.1 ug)
6.0 ul 10X NEB buffer #2
2.0 ul RNase One (Promega, 10 U/ul)
0.5 ul E. coli RNAse H (Epicenter) (10 u/ul)

V T = 60 ul

Incubate at 37 °C for 60 minutes
**** STOP the 5 ug cDNA Library ****

* JCB Annealing Buffer = 30 mM Tris pH 8, 10 mM MgCl2, 300 mM NaCl (made with
Ambion Solutions)

60 ul previous Rxn
30 ul DEPC-water
10 ul 10X JCB Annealing Buffer

V T = 100 ul

Heat to 80 °C for 5 min, remove heating block and cool until the
temperature reaches 37 °C (for 30 minutes).

EtOH precip with glycogen
Resuspend in 10 ul 0.5X TE (110 ng/ul)



2nd Strand Synthesis

2 ul 10X T7 Buffer
3.6 ul Water
10 ul Annealed cDNA (1.1 ug)
0.5 ul 20 mM dNTPs (Epicenter)
0.9 ul BSA (1 mg/ml) (NEB)
3 ul T7 DNA polymerase diluted to (3 units/ul) (NEB)

V T = 20 ul
Incubate at 37 °C for 3-5 minutes

Transformation

1 ul 2nd strand reaction (11 ng) *diluted (1:5)
40 ul Electromax DH10B E. coli

V T = 41 ul

Electroporate the transformation reaction at 1.8 kV.

Add 1 ml of SOC media to the cells and transfer to a culture tube

Grow for 1 hour at 37 °C

Plate on to LB + 100 mcg/ml AMP plates (LARGE) -- 50 ul & 200 ul
Grow around 16 hours

Day Three & Four

*** Count the colonies and calculate the titer (cfu/ug)
Culturing for Mini-Preps

• Fill a 96-deep well culture dish with 1 ml of TB with AMP (100 ug/ml)

• Pick a single colony using a toothpick and place it into one well. Continue until
all wells are inoculated. Remove the toothpicks and cover with air pore tape. Grow at
least 16 hours overnight (up to 24 hours).

Mini-Preps (Qiagen)

• Spin down plate at 4000 rpm for 10 minutes (Program #7).

• Check for pellet and then pour out media.

• Continue following Qiagen 96-well Turbo Mini-prep protocol

Digests

• Use a U-shaped 96-well culture plate for digests.

• For 105 Rxn at 15 ul/reaction

Master mix (x105)    Per reaction    Component
210 ul               2 ul            Buffer #3
                     5 ul            plasmid
1218 ul              11.6 ul         milli-Q water
63 ul                0.6 ul          XhoI
63 ul                0.6 ul          PstI
21 ul                0.2 ul          100X BSA
V T = 1575 ul        V T = 20 ul
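
The master-mix column of the digest table is simply the per-reaction volumes multiplied by the number of reactions, with the plasmid added separately to each well. The sketch below reproduces that scaling; it assumes the per-reaction values as read from the table (2 ul Buffer #3, 11.6 ul water, 0.6 ul of each enzyme, 0.2 ul 100X BSA, 5 ul plasmid), and the names used are illustrative.

```python
# Per-reaction volumes in ul for everything that goes into the master mix.
PER_REACTION_UL = {
    "Buffer #3": 2.0,
    "milli-Q water": 11.6,
    "XhoI": 0.6,
    "PstI": 0.6,
    "100X BSA": 0.2,
}
PLASMID_UL = 5.0  # added to each well individually, not to the master mix


def master_mix(per_reaction, n_reactions):
    """Scale each component to n_reactions, rounded to 0.1 ul."""
    return {name: round(vol * n_reactions, 1) for name, vol in per_reaction.items()}


mix = master_mix(PER_REACTION_UL, 105)
mix_total = round(sum(mix.values()), 1)
per_well_total = round(sum(PER_REACTION_UL.values()) + PLASMID_UL, 1)

for name, vol in mix.items():
    print(f"{vol:7.1f} ul  {name}")
print(f"master mix total: {mix_total} ul; per well with plasmid: {per_well_total} ul")
```

Making the mix for 105 reactions when a plate holds 96 wells leaves a margin for pipetting loss.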





Incubate at 37 °C for 2 hours

Add 3 ul 6X loading dye

Run on gel at 250 volts for 1.5-2 hours

Stain gel for 10-15 minutes

Table 1 shows the results of making a cDNA library of rabbit globin mRNA using
the PAVE method of the present invention.

Table 2 shows the results of making cDNA libraries from a variety of mRNA
sources using both "conventional" methods and the PAVE method of the present
invention. The "conventional" method employed a kit obtained from GIBCO/BRL and
utilized a 3' oligo-dT primer and SalI adaptors.

Table 3 shows a number of parameters of the T4 RNA ligase reaction that may be
modified to obtain optimal efficiency of the reaction. The most preferred reaction
conditions include performing the reaction at room temperature overnight (or 16 hours);
using an acceptor/donor ratio that is the same as that obtained from reacting 2 ug mRNA
(average size 1.5 kb) with 175 pmoles of a 27-residue RNA tag; and performing the
reaction in RNAse-free Tris MgCl2 buffer with tRNA, DTT, and 5.8 nM ATP added.

Table 1. Analysis of cDNA library made from rabbit globin mRNA

                       Number of Colonies    Percentage
Total Positives (a)    385                   100%
Full-length (b)        292                   75.8%
3'-only (c)            75                    19.5%
5'-only (d)            18                    4.7%

a. Duplicate filters were lifted from one plate and hybridized to
two labeled oligonucleotide probes complementary to the 5' and
3' ends of rabbit β-globin mRNA. The total positives were
counted.

b. Full-length clones were double positives to 5' and 3' probes.

c. Clones hybridized only to 3'-end probes.

d. Clones hybridized only to 5'-end probes.
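
The percentages in Table 1 follow directly from the colony counts, and the three classes account for all positives. A short check, using only the counts given in the table (the variable names are illustrative):

```python
# Colony counts from Table 1.
TOTAL_POSITIVES = 385
COUNTS = {"Full-length": 292, "3'-only": 75, "5'-only": 18}

# The three classes should partition the total positives.
classes_sum = sum(COUNTS.values())

# Percentage of total positives in each class, to one decimal place.
percentages = {name: round(100.0 * n / TOTAL_POSITIVES, 1)
               for name, n in COUNTS.items()}

print("sum of classes:", classes_sum)
for name, pct in percentages.items():
    print(f"{name}: {pct}%")
```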
TABLE 3

Optimization of RNA-RNA ligation by T4 RNA ligase

1. Effect of Temperature: 4 °C, O/N; 16 °C, O/N; Room Temperature,
O/N; 37 °C, O/N; 37 °C, 3 hrs

2. Time Courses at Suitable Temperature: 0.5, 2, 4, 8, 16, 24 hrs

3. Effect of Denaturants: DMSO: 10%, 20%, 30%, 40%
Urea: 0.5 M, 1 M, 2 M, 3 M, 4 M
Formamide: 5%, 10%, 20%, 40%

4. Effect of Acceptor/Donor Ratio: 1, 10, 20, 50, 100, [illegible]

5. Effect of PEG: 5%, 10%, 15%, 20%, 25%

6. Effect of Buffers (?): Glycylglycine, HEPES or Tris

7. Effect of Inorganic Pyrophosphatase (PPi is inhibitory, but Pi is not!!)

8. Effect of HCC (hexamine cobalt chloride): [illegible] mM, 1 mM, 2 mM, 5 mM

9. Effect of Single-Stranded RNA Binding Proteins (i.e., T4 gene 32)

What is claimed is:

1. A method for preparing a modified mRNA molecule which comprises
ligating a tag comprising at least one ribonucleotide residue to the 5' end of one or more
mRNA molecules, wherein the tag does not contain deoxyribonucleotide residues.

2. The method of claim 1 further comprising a prior step of treating at least 
one mRNA molecule with pyrophosphatase so that the 7-methylguanosine (7mG) cap is 
removed from the 5' end of at least one mRNA molecule. 

3. The method of claim 2 wherein the pyrophosphatase is tobacco acid 
pyrophosphatase. 

4. The method of claim 1 further comprising a prior step of treating at least 
one mRNA molecule with phosphatase so that the 5' phosphate is removed from at least 
one mRNA molecule not having a 7-methylguanosine (7mG) cap. 

5. The method of claim 4 wherein the phosphatase is selected from the group 
consisting of HK phosphatase and BA phosphatase. 

6. The method of claim 1 wherein the tag further comprises a biotin residue. 

7. The method of claim 1 wherein the tag has the following ribonucleotide 
sequence: 5'-ACUAGUGACCAGCUGAUACGCCUCAAA-3' 

8. The method of claim 1 wherein the ligation reaction is performed using T4 
RNA ligase. 

9. The method of claim 1 wherein the ligation reaction is performed at room 
temperature overnight. 

10. The method of claim 1 wherein the ligation reaction is performed in the
presence of tRNA molecules.

11. The method of claim 1 wherein the ligation reaction is performed in an ATP
concentration selected from the group consisting of: 2 nM, 3 nM, 4 nM, 4.5 nM, 5 nM, 5.5
nM, 5.8 nM, 6 nM, 6.5 nM, 7 nM, 7.5 nM, 8 nM, 9 nM, and 10 nM.

12. The method of claim 11 wherein the ATP concentration is 5.8 nM.

13. A modified mRNA molecule produced according to the method of claim
1.

14. A method for preparing at least one vector-primer molecule which 
comprises contacting at least one primer with at least one vector molecule so that at least 
one complementary base-pair is formed between the primer and the vector molecule. 

15. The method of claim 14 wherein the vector is selected from the group 
consisting of pED6dpc2, pED6dpc4, pNOTs, and pAVEl. 

16. The method of claim 14 wherein at least one primer has a nucleotide 
sequence selected from those shown as "3' linker" and "5' linker" in Figures 7 and 8. 

17. The method of claim 14 further comprising a subsequent step of ligating 
at least one primer to at least one vector molecule. 

18. The method of claim 17 wherein the ligation reaction is performed with T4 
DNA ligase. 

19. A vector-primer molecule produced according to the method of claim 14. 

20. A method for preparing a cDNA library comprising the steps of: 

(a) ligating a tag comprising at least one ribonucleotide residue to the 
5' end of one or more mRNA molecules, wherein the tag does not contain 
deoxyribonucleotide residues; 

(b) contacting the products of step (a) with a vector-primer molecule
so that at least one complementary base-pair is formed between at least one
product of step (a) and the vector-primer molecule.

21. The method of claim 20 further comprising a subsequent RNAse digestion
step.

22. The method of claim 20 further comprising a subsequent DNA polymerase
second-strand synthesis step.

23. The method of claim 22 wherein the DNA polymerase is selected from the 
group consisting of T4, T7, Pfu, and SEQUENASE DNA polymerases. 

24. The method of claim 22 wherein the DNA polymerase reaction is 
performed for a time period selected from the group consisting of: 1 minute, 2.5 minutes,
5 minutes, 7.5 minutes, 10 minutes, 20 minutes, 30 minutes, or 60 minutes. 

25. The method of claim 24 wherein the DNA polymerase reaction is 
performed for 5 minutes. 

26. The method of claim 20 further comprising a subsequent step comprising 
transforming host cells with the products of step (b) of claim 20. 

27. The method of claim 26 wherein the host cells are transformed with the 
products of step (b) of claim 20 without a DNA polymerase second-strand synthesis step 
having been performed. 

28. The method of claim 26 wherein the host cells are transformed with the 
products of step (b) of claim 20 without a DNA ligase step having been performed. 

29. A cDNA library comprising cDNA molecules produced according to the 
method of claim 20. 

30. The method of claim 20, wherein the mRNA molecules are human mRNA 
molecules. 

31. The method of claim 20, wherein the mRNA molecules are mammalian
mRNA molecules.

32. The method of claim 20, wherein the mRNA molecules are mRNA
molecules extracted from a species of plant.



33. The cDNA library of claim 29, wherein the mRNA molecules of claim 20
are human mRNA molecules.

Labeling of Full-length mRNA With An RNA Tag

[Figure: mRNA molecules ending in (A)n-OH 3' are treated with Tobacco Acid
Pyrophosphatase (TAP) and then joined to the RNA Tag by T4 RNA ligase.]

(RNA Tag: BIOTIN-5'-ACUAGUGACCAGCUGAUACGCCUCAAA-3')
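
Because second-strand priming relies on base pairing with the extreme 3' end of the full-length first-strand cDNA, it is useful to know what the tag looks like once it has been copied into cDNA. The sketch below checks the tag length and derives the DNA reverse complement of the tag, which is the sequence a full-length first strand would be expected to carry at its 3' end. The helper names are illustrative, and no claim is made here about the actual vector-attached primer sequence, which is given in the figures rather than in this section.

```python
RNA_TAG = "ACUAGUGACCAGCUGAUACGCCUCAAA"  # 5'->3', from the figure legend above


def rna_to_dna(seq):
    """Rewrite an RNA sequence in the DNA alphabet (U -> T)."""
    return seq.replace("U", "T")


def reverse_complement(dna):
    """Reverse complement of a DNA sequence, read 5'->3'."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(dna))


tag_length = len(RNA_TAG)
tag_as_dna = rna_to_dna(RNA_TAG)
first_strand_3prime_end = reverse_complement(tag_as_dna)

print(tag_length)
print(tag_as_dna)
print(first_strand_3prime_end)
```

A primer intended to anneal to that first-strand end would, in this simple picture, share sequence with the DNA form of the tag itself.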
RESTRICTION AND FUNCTIONAL MAP
OF THE pED6dpc2 EXPRESSION PLASMID

[Figure: plasmid map; recoverable site labels: BamHI 2772, ClaI 2666]

Plasmid Name: pED6dpc2
Plasmid Size: 5374 bp

Comments: The origin, function, and position of the various elements of the pED6dpc2 expression plasmid are provided
below. The various nucleotide (nt) positions within the plasmid are given relative to the 5' end of the SV40
enhancer segment, the first nt of which was assigned as Position 1.

DiscoverEase™ cDNAs are cloned between EcoRI and NotI.

ClaI, NheI, SapI, and NdeI are unique sites in the expression plasmid.

SV40 enhancer (nt 1-345): This fragment originated from the SV40 genome. It contains the SV40 origin of replication and
transcriptional enhancer. The SV40 enhancer sequence increases the level of transcription from the adenovirus 2 (Ad2) major
late promoter.

Ad2 MLP (nt 364-656): This fragment contains the Ad2 major late promoter (MLP) from XhoI to PvuII.

Ad2 TPL (nt 657-796): This fragment represents a cDNA copy of the majority of the tripartite leader present on all late Ad2
mRNAs.

Hybrid intron (nt 797-1059): The hybrid intervening sequence contains a 5' splice from the Adenovirus tripartite leader and a 3'
splice from a murine IgG gene.

Polylinker (nt 1059-1093): The DiscoverEase™ cDNAs are cloned into the EcoRI-NotI site. The 5' end of the cDNAs contains
a SfiI site.

EMCV Leader (nt 1104-1649): This sequence is derived from the encephalomyocarditis virus (EMCV) RNA. This sequence
allows ribosomes to initiate translation internally, resulting in a more efficient translation of the DHFR gene.

Mouse DHFR cDNA (nt 1650-2317): A selectable marker in Chinese hamster ovary cells.

SV40 polyadenylation site (nt 2318-2550): This fragment contains the polyadenylation site from the SV40 early region.

Ad2 VAI gene (nt 2551-2905): This fragment is derived from the Ad2 genome and encodes the virus-associated RNA I.

pUC19 backbone (nt 2906-5374): This fragment includes the ColE1 origin of replication which allows replication of the
plasmid in E. coli, and the beta-lactamase gene (nt 3913-4708) which confers ampicillin resistance and is used as a selectable
marker in the propagation of the plasmid in E. coli.
The following is the sequence alignment of pED6dpc2 and
pED6dpc4.

dpc2    1 AAGCTTTTTGCAAAAGCCTAGGCCTCCAAAAAAGCCTCCTCACTACTTCT 50
          ||||||||||||||||||||||||||||||||||||||||||||||||||
dpc4    1 AAGCTTTTTGCAAAAGCCTAGGCCTCCAAAAAAGCCTCCTCACTACTTCT 50

dpc2   51 GGAATAGCTCAGAGGCCGAGGCGGCCTCGGCCTCTGCATAAATAAAAAAA 100
          ||||||||||||||||||||||||||||||||||||||||||||||||||
dpc4   51 GGAATAGCTCAGAGGCCGAGGCGGCCTCGGCCTCTGCATAAATAAAAAAA 100

dpc2  101 ATTAGTCAGCCATGGGGCGGAGAATGGGCGGAACTGGGCGGAGTTAGGGG 150
          ||||||||||||||||||||||||||||||||||||||||||||||||||
dpc4  101 ATTAGTCAGCCATGGGGCGGAGAATGGGCGGAACTGGGCGGAGTTAGGGG 150

dpc2  151 CGGGATGGGCGGAGTTAGGGGCGGGACTATGGTTGCTGACTAATTGAGAT 200
          ||||||||||||||||||||||||||||||||||||||||||||||||||
dpc4  151 CGGGATGGGCGGAGTTAGGGGCGGGACTATGGTTGCTGACTAATTGAGAT 200

dpc2  201 GCATGCTTTGCATACTTCTGCCTGCTGGGGAGCCTGGGGACTTTCCACAC 250
          ||||||||||||||||||||||||||||||||||||||||||||||||||
dpc4  201 GCATGCTTTGCATACTTCTGCCTGCTGGGGAGCCTGGGGACTTTCCACAC 250

dpc2  251 CTGGTTGCTGACTAATTGAGATGCATGCTTTGCATACTTCTGCCTGCTGG 300
          ||||||||||||||||||||||||||||||||||||||||||||||||||
dpc4  251 CTGGTTGCTGACTAATTGAGATGCATGCTTTGCATACTTCTGCCTGCTGG 300

dpc2  301 GGAGCCTGGGGACTTTCCACACCCTAACTGACACACATTCCACAGGATCC 350
          ||||||||||||||||||||||||||||||||||||||||||||||||||
dpc4  301 GGAGCCTGGGGACTTTCCACACCCTAACTGACACACATTCCACAGGATCC 350

dpc2  351 GGTCGCGCGAATTTCGAGCGGTGTTCCGCGGTCCTCCTCGTATAGAAACT 400
          ||||||||||||||||||||||||||||||||||||||||||||||||||
dpc4  351 GGTCGCGCGAATTTCGAGCGGTGTTCCGCGGTCCTCCTCGTATAGAAACT 400

dpc2  401 CGGACCACTCTGAGACGAAGGCTCGCGTCCAGGCCAGCACGAAGGAGGCT 450
          ||||||||||||||||||||||||||||||||||||||||||||||||||
dpc4  401 CGGACCACTCTGAGACGAAGGCTCGCGTCCAGGCCAGCACGAAGGAGGCT 450

dpc2  451 AAGTGGGAGGGGTAGCGGTCGTTGTCCACTAGGGGGTCCACTCGCTCCAG 500
          ||||||||||||||||||||||||||||||||||||||||||||||||||
dpc4  451 AAGTGGGAGGGGTAGCGGTCGTTGTCCACTAGGGGGTCCACTCGCTCCAG 500
501 GGTGTGAAGACACATGTCGCCCTCTTCGGCATCAAGGAAGGTGATTGGTT 550 

M Ml II II I II II il IMIMMIMM II Mill I II 1 1 M 1 1 1 1 III 

501 GGTGTGAAGACACATGTCGCCCTCTTCGGCATCAAGGAAGGTGATTGGAA 550 
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551 TATAGGTGTAGGCCACGTC-ACCGGGTGTTCCTGAAGGGGGGCTATAAAAG 600 

I M M Mill Mill i 1 1 M I , MM III Ml I 

5 51 TATAGGTGTAC-GCC ACGTG ACCC-GGTGTTCCTGAAGGGGGGCTATAAAAG 600 

601 GGGGTGGGGGCGCGTTCGTCCTCACTCTCTTCCGCATCGCTGTCTGCC-AG 650 

I I I I I I 1 I I I I f f I I i I I i I f I I I 1 I I I I I I I I 1 I f I I I I 1 I I I I 

601 GGGGTGGGGGCGCGTTCGTCCTCACTCTCTTCCGCATCGCTGTCTGCGAG 650 

6 51 GGCCAGCTGTTGGGCTCGCGGTTGAGGACAAACTCTTCGCGGTCTTTCCA 700 

IIIIIMIIIIIIIIIIIIIIIIIIMIIIIIIIIIIIIIIIIIIIIIII 

651 GGCCAGCTGTTGGGCTCGCGGTTGAGGACAAACTCTTCGCGGTCTTTCCA 700 

7 01 GTACTCTTGGATCGGAAACCCGTCGGCCTCCGAACGGTACTCCGCCACCG 7 50 

I i 1 1! IN!; II 1 1 M 1 1 1 1 1 1 1 Ml III I M II II Ml M 1 1 1 II I ! 1 1 

701 GTACTCTTGGATCGGAAACCCGTCGGCCTCCGAACGGTACTCCGCCACCG 750 
751 AGGGACCTGAGCGAGTCCGCATCGACCGGATCGGAAAACCTCTCGACTGT 800 

I Ml Ml M Ml Ml 1 1 ! M M IM I II I MM IMM I M 1 1 M 1 1 1 1 

751 AGGGACCTGAGCGAGTCCGCATCGACCGGATCGGAAAACCTCTCGACTGT 800

801 TGGGGTGAGTACTCCCTCTCAAAAGCGGGCATGACTTCTGCGCTAAGATT 850

M I IMIIII M I Ml M M Ml M M 1 1 1 1 1 II II II 1 1 I M Ml I M I 

801 TGGGGTGAGTACTCCCTCTCAAAAGCGGGCATGACTTCTGCGCTAAGATT 850
851 GTCAGTTTCCAAAAACGAGGAGGATTTGATATTCACCTGGCCCGCGGTGA 900

II III IMIIII I Mill II MMulM Ml IMIIII 1 1 III II 1 1 M 

851 GTCAGTTTCCAAAAACGAGGAGGATTTGATATTCACCTGGCCCGCGGTGA 900
901 TGCCTTTGAGGGTGGCCGCGTCCATCTGGTCAGAAAAGACAATCTTTTTG 950

II Ml IMIIIIIIIIIIIMIIIIIIII MM 

901 TGCCTTTGAGGGTGGCCGCGTCCATCTGGTCAGAAAAGACAATCTTTTTG 950 
951 TTGTCAAGCTTGAGGTGTGGCAGGCTTGAGATCTGGCCATACACTTGAGT 1000 

1 1 i 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 ! 1 1 1 1 I I Mill innn 

951 TTGTCAAGCTTGAGGTGTGGCAGGCTTGAGATCTGGCCATACACTTGAGT 1000
1001 GACAATGACATCCACTTTGCCTTTCTCTCCACAGGTGTCCACTCCCAGGT 1050

II IM IMIIMMM MM MIMMIMM MM III I III I MMM 

1001 GACAATGACATCCACTTTGCCTTTCTCTCCACAGGTGTCCACTCCCAGGT 1050 
1051 CCAACTGCA Gact 1063

M 1 1 1 II 1 1 II I linn 

1051 CCAACTGCAGGCCGGCCtctaatacgactcactatagGGCGCGCCtgaat 1100
1064 tcGAATTCt 1072

nn mi 

1101 tcGATATCttaagCCCGGGcacGTCGACgcggccgcGCGATCGCccttCa 1150
1073 acCgaCTCGAGactccattGCGGCCGCaattctaacgtta 1112

I I lllllll II I I IIIIIIIIIIIIIMMM 

1151 gcgagggTTAATTAActcgacTCTAGAccggGGCCGCaattctaacgtca 1200
1113 ccggccgaagccgcttgaaacaaggccggcgcgcgtuCgtctacacgcta 1162 

IIIIIIIIIIIIIIIillllllMlilMIIIIIIIIIIIillllllMI 

1201 ctggccgaagccgcttggaataaggccggtgcgcgcttgtctatatgtta 1250 
1163 ttttccaccatattgccgtcctttggcaatgtgagggcccagaaacctgg 1212 

lllllllllllllllll III 1 1! IIIIIIIIIIIIIIIMMIIIIIII I 

1251 tttcccaccatattgccgtcruCtggcaatgtgagggcccggaaacccgg 1300 
1213 ccctgtcttcttgacgagcacccctaggggtctttcccctctcgccaaag 1262 

1 1 ! I II M I If j I i 1 1 1 1 1 M I ! M I i M 1 1 M I M 1 1 1 1 i 1 1 1 1 1 M 1 1 

1301 ccctgtcttcttgacgagcattcctaggggtctttcccctctcgccaaag 1350
1263 gaatgcaaggtctgctgaacgrcgtgaaggaagcagctcctctggaagcc 1312 

IIIMMIMIMMIMM ! II IMIIIIIIMMIMMI III III M 

1351 gaatgcaaggtctgttgaargccgtgaaggaagcagttcctctggaagct 1400
1313 tcttgaagacaaacaacgcccgcagcgaccctttgcaagcagccgaaccc 1362

iiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiii 

1401 tcttgaagacaaacaacgtccgtagcgaccccttgcaggcagcggaaccc 1450 
1363 cccacctggcgacaggngcc-ccgcggccaaaagccacgtgtataagata 1412 

IIIIIIIIIIMIIIIIIII I IIIIIIIIIIIIIIIMIIIIIM INI I 

1451 cccacctggcgacaggtgcc~ctgcggccaaaagccacgtgtataagaca 1500
1413 cacctgcaaaggcgacacaaccccagtgccacgttgtgagttaaatagtc 1462 

lllllllllllllllllll II III IIIIIIIIIIIIIIIIIIMII III I 

1501 cacctgcaaaggcggcacaaccccagtgccacgttgtgagttggatagtt 1550 
1463 gtggaaagagtcaaatggctctcctcaagcgtattcaacaaggggctgaa 1512 

II II I lllllll III III! 1 1 III lllllllllllllll II II 1 1 1 II 1 1 

1551 gtggaaagagtcaaatggctctcctcaagcgtattcaacaaggggctgaa 1600 
1513 ggatgcccagaaggcaccccattgtatgagatctgatctggggcctcggt 1562 

MIIIIIIMIIIIIIIIIIIIIIIIIIIIIIIIMIIIIIIIM I III I 

1601 ggatgcccagaaggtaccccatcgtatgggatctgatctggggcctcggc 1650
1563 gcacatgctttacatgtgtttagtcgaggccaaaaaacgtccaggccccc 1612 

lllllllllllllllllllll IIIIIIIIMIMMIIIMIMMIII I 

1651 gcacatgctttacatgtgtccagtcgaggttaaaaaacgtctaggccccc 1700 
1613 cgaaccacggggacgtggttctcctttgaaaaacacgATgataatatcgc 1662 

IIIIIIIIIIIIIIIIMII! IIIUIIIIMMIMIHIIMIIH II 

1701 cgaaccacggggacgtggtccccctttgaaaaacacgATgataatattgc 1750
1663 cacaaccatggcccgaccattgaactgcatcgtcgccgtgtccCAAAATA 1712

IIIIIIIIIIIIIIIIIIIIIIIIMIMIIIMIIIMIMIIIIIIM 

1751 cacaaccatggttcgaccattgaactgcatcgccgccgtgtccCAAAATA 1800

1713 TGGGGATTGGCAAGAACGGAGACCTACCCTGGCCTCCGCTCAGGAACGAG 1762 

IIIIIIIIIIIIIIIMIIIIIIIIIIIIIIIIIIMIIIIIMIIMII 

1801 TGGGGATTGGCAAGAACGGAGACCTACCCTGGCCTCCGCTCAGGAACGAG 1850 

1763 TTCAAGTACTTCCAAAGAATGACCACAACCTCTTCAGTGGAAGGTAAACA 1812

IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIMIIIimilMi 

1851 TTCAAGTACTTCCAAAGAATGACCACAACCTCTTCAGTGGAAGGTAAACA 1900

1813 GAATCTGGTGATTATGGGTAGGAAAACCTGGTTCTCCATTCCTGAGAAGA 1862

IIIIIIMNIIIIIIIMIIiMIIIIIIIIIIIIIIIIIIIIIMIII 

1901 GAATCTGGTGATTATGGGTAGGAAAACCTGGTTCTCCATTCCTGAGAAGA 1950 

1863 ATCGACCTTTAAAGGACAGAATTAATATAGTTCTCAGTAGAGAACTCAAA 1912

IIIIMIhlMIIIMIMMIIIIIIMIIIMIIIIMIIIIMIII 

1951 ATCGACCTTTAAAGGACAGAATTAATATAGTTCTCAGTAGAGAACTCAAA 2000 

1913 GAACCACCACGAGGAGCTCATTTTCTTGCCAAAAGTTTGGATGATGCCTT 1962 

II I II III I II I II III! I IMIIII Mill H IN MM M MM Ml I 

2001 GAACCACCACGAGGAGCTCATTTTCTTGCCAAAAGTTTGGATGATGCCTT 2050 

1963 AAGACTTATTGAACAACCGGAATTGGCAAGTAAAGTAGACATGGTTTGGA 2012

iitiiiiiiiiiiuiiiiiiiiiiimiimMi in mm 9inn 

2051 AAGACTTATTGAACAACCGGAATTGGCAAGTAAAGTAGACATGGTTTGGA 2100 

2013 TAGTCGGAGGCAGTTCTGTTTACCAGGAAGCCATGAATCAACCAGGCCAC 2062 

1 1 M 1 1 1 1 1 1 1 1 1 II I M 1 1 1! M II II II I II I II 1 1 II 1 1 II II I II I 

2101 TAGTCGGAGGCAGTTCTGTTTACCAGGAAGCCATGAATCAACCAGGCCAC 2150 

2063 CTCAGACTCTTTGTGACAAGGATCATGCAGGAATTTGAAAGTGACACGTT 2112 

I II 1 1 II I II I M 1 1 1 ! II 1 1 M 1 1 1 II I M 1 1 1 II M II M I [M MM 

2151 CTCAGACTCTTTGTGACAAGGATCATGCAGGAATTTGAAAGTGACACGTT 2200 

2113 TTTCCCAGAAATTGATTTGGGGAAATATAAACTTCTCCCAGAATACCCAG 2162 

IIIIIIIMIIMIMIUIIIIIIMIIIIIIIIMIIIIIIIMMII 

2201 TTTCCCAGAAATTGATTTGGGGAAATATAAACTTCTCCCAGAATACCCAG 2250 

2163 GCGTCCTCTCTGAGGTCCAGGAGGAAAAAGGCATCAAGTATAAGTTTGAA 2212 

IIMIIIIIIIIIMIIIIIIIIIIIIIIIIIIIIIIIIIIIIMIIIII oinn 

2251 GCGTCCTCTCTGAGGTCCAGGAGGAAAAAGGCATCAAGTATAAGTTTGAA 2300

2213 GTCTACGAGAAGAAAGACTAACAGGAAGATGCTTTCAAGTTCTCTGCTCC 2262 

IIIIIMIIIIIIIIIIIMIIIIIIIIMIMMIIIIIIIIMIIIII 

2301 GTCTACGAGAAGAAAGACTAACAGGAAGATGCTTTCAAGTTCTCTGCTCC 2350
2263 CCTCCTAAAGCTATGCATTTTTTATAAGACCATGGGACTTTTGCTGGCTT 2312

III Ml I II 1 1 II I II III Mill IMIIMIIIMMii III Mlllll 

2351 CCTCCTAAAGCTATGCATTTTTTATAAGACCATGGGACTTTTGCTGGCTT 2400
2313 TAGATCATAATCAGCCATACCACATTTGTAGAGGTTTTACTTGCTTTAAA 2362

MM III II I III MM III MM IMIIMMMIIMMII I II II II 

2401 TAGATCATAATCAGCCATACCACATTTGTAGAGGTTTTACTTGCTTTAAA 2450

2363 AAACCTCCCACACCTCCCCCTGAACCTGAAACATAAAATGAATGCAATTG 2412

MMIMM MM MMMMMMMIMMMMMMMMMMM 

2451 AAACCTCCCACACCTCCCCCTGAACCTGAAACATAAAATGAATGCAATTG 2500
2413 TTGTTGTTAACTTGTTTATTGCAGCTTATAATGGTTACAAATAAAGCAAT 2462

MMMMMMMMIMMMMMMMMMMMMMMMM! 

2501 TTGTTGTTAACTTGTTTATTGCAGCTTATAATGGTTACAAATAAAGCAAT 2550 

2463 AGCATCACAAATTTCACAAATAAAGCATTTTTTTCACTGCATTCTAGTTG 2512

I III III II I Ml II M Ml MMMMIMMMIIMIMMM II II 

2551 AGCATCACAAATTTCACAAATAAAGCATTTTTTTCACTGCATTCTAGTTG 2600
2513 TGGTTTGTCCAAACTCATCAATGTATCTTATCATGTCTGGATCCCCGGCC 2562 

I ill IM I! II II MM IM MMMMilMIMMMMMM II I II 

2601 TGGTTTGTCCAAACTCATCAATGTATCTTATCATGTCTGGATCCCCGGCC 2650 
2563 AACGGTCTGGTGACCCGGCTGCGAGAGCTCGGTGTACCTGAGACGCGAGT 2612 

1 1 II I II M II 1 1 1 1 II 1 1 M II M 1 1 1 1 1 M I II 1 1 1 1 1 M I II 1 1 M I 

2651 AACGGTCTGGTGACCCGGCTGCGAGAGCTCGGTGTACCTGAGACGCGAGT 2700
2613 AAGCCCTTGAGTCAAAGACGTAGTCGTTGCAAGTCCGCACCAGGTACTGA 2662 

MMMMMIMMMMMMMMMMMMMMMMM MMI 

2701 AAGCCCTTGAGTCAAAGACGTAGTCGTTGCAAGTCCGCACCAGGTACTGA 2750 
2663 TCATCGATGCTAGACCGTGCAAAAGGAGAGCCTGTAAGCGGGCACTCTTC 2712

MMM M MMMMMMMMMM IM MUM MMMM MMI 

2751 TCATCGATGCTAGACCGTGCAAAAGGAGAGCCTGTAAGCGGGCACTCTTC 2800
2713 CGTGGTCTGGTGGATAAATTCGCAAGGGTATCATGGCGGACGACCGGGGT 2762 

MMMMM MMMMMMMMMIMMMMMMMM MMI 

2801 CGTGGTCTGGTGGATAAATTCGCAAGGGTATCATGGCGGACGACCGGGGT 2850 
2763 TCGAACCCCGGATCCGGCCGTCCGCCGTGATCCATCCGGTTACCGCCCGC 2812 

1 1 1 i 1 1 1 1 1 1 II 1 1 1 1 1 II 1 1 1 1 1 II M 1 1 1 1 1 1 1 1 1 1 1 1 ! 1 1 1 1 1 i M I 

2851 TCGAACCCCGGATCCGGCCGTCCGCCGTGATCCATCCGGTTACCGCCCGC 2900 
2813 GTGTCGAACCCAGGTGTGCGACGTCAGACAACGGGGGAGCGCTCCTTTTG 2862

I III IM II 1 1 1 1 IMIMMMIMIMIMIII MIMM III II 1 1 

2901 GTGTCGAACCCAGGTGTGCGACGTCAGACAACGGGGGAGCGCTCCTTTTG 2950 
2863 GCTTCCTTCCAGGCGCGGCGGCTGCTGCGCTAGCTTTTTTGGCGAGCTCG 2912

II Mill ! II M 1 1 II 1 1 1 ! 1 1 i I IMII 1 1 Mi I ! li 1 1 1 II I Ml 1 1 1 

2951 GCTTCCTTCCAGGCGCGGCGGCTGCTGCGCTAGCTTTTTTGGCGAGCTCG 3000
2913 AATTAATTCTGCATTAATGAATCGGCCAACGCGCGGGGAGAGGCGGTTTG 2962

Ml MM I II I II I III 1 1 M MM IMIMI I II I Ml II I M MM I M 

3001 AATTAATTCTGCATTAATGAATCGGCCAACGCGCGGGGAGAGGCGGTTTG 3050
2963 CGTATTGGGCGCTCTTCCGCTTCCTCGCTCACTGACTCGCTGCGCTCGGT 3012

MMMI III I IMIIIII II MIMMMIMM IMMIM Mill 1 1 

3051 CGTATTGGGCGCTCTTCCGCTTCCTCGCTCACTGACTCGCTGCGCTCGGT 3100
3013 CGTTCGGCTGCGGCGAGCGGTATCAGCTCACTCAAAGGCGGTAATACGGT 3062 

MMMI III IMIIMMM IIIMIMM IMIIMM I MMMI II 

3101 CGTTCGGCTGCGGCGAGCGGTATCAGCTCACTCAAAGGCGGTAATACGGT 3150 
3063 TATCCACAGAATCAGGGGATAACGCAGGAAAGAACATGTGAGCAAAAGGC 3112

I IMIIMM MM MMMI MIMIIMI MMIMMIMM III 1 1 

3151 TATCCACAGAATCAGGGGATAACGCAGGAAAGAACATGTGAGCAAAAGGC 3200 
3113 CAGCAAAAGGCCAGGAACCGTAAAAAGGCCGCGTTGCTGGCGTTTTTCCA 3162

I II MIM 1 1 II III II MM I MIMIIMI III Ml 1 1 1 1 1 1 II II 1 1 

3201 CAGCAAAAGGCCAGGAACCGTAAAAAGGCCGCGTTGCTGGCGTTTTTCCA 3250
3163 TAGGCTCCGCCCCCCTGACGAGCATCACAAAAATCGACGCTCAAGTCAGA 3212

MMMIIMI I Ml II! IMM MMMMIMMMIM I MIMI 1 1 

3251 TAGGCTCCGCCCCCCTGACGAGCATCACAAAAATCGACGCTCAAGTCAGA 3300
3213 GGTGGCGAAACCCGACAGGACTATAAAGATACCAGGCGTTTCCCCCTGGA 3262

! Ml MM Ml I II Ml I MM MIMIIMI Ml MM II 1 1 1 II M 1 1 

3301 GGTGGCGAAACCCGACAGGACTATAAAGATACCAGGCGTTTCCCCCTGGA 3350
3263 AGCTCCCTCGTGCGCTCTCCTGTTCCGACCCTGCCGCTTACCGGATACCT 3312 

I MMMI I M I III Mill 1 1 MM MMMI II MM M II I II 1 1 1 1 

3351 AGCTCCCTCGTGCGCTCTCCTGTTCCGACCCTGCCGCTTACCGGATACCT 3400 
3313 GTCCGCCTTTCTCCCTTCGGGAAGCGTGGCGCTTTCTCAATGCTCACGCT 3362

MMMMIMMMMMMMMMMMMMMMMI MMMM 

3401 GTCCGCCTTTCTCCCTTCGGGAAGCGTGGCGCTTTCTCAATGCTCACGCT 3450 
3363 GTAGGTATCTCAGTTCGGTGTAGGTCGTTCGCTCCAAGCTGGGCTGTGTG 3412

MMMM MMMMIMMMMMMM MMMM MIM M MM 

3451 GTAGGTATCTCAGTTCGGTGTAGGTCGTTCGCTCCAAGCTGGGCTGTGTG 3500
3413 CACGAACCCCCCGTTCAGCCCGACCGCTGCGCCTTATCCGGTAACTATCG 3462 

MMMMMMMMIMMMMMMMMMMM M MMMIM 

3501 CACGAACCCCCCGTTCAGCCCGACCGCTGCGCCTTATCCGGTAACTATCG 3550 
3463 TCTTGAGTCCAACCCGGTAAGACACGACTTATCGCCACTGGCAGCAGCCA 3512

IlilMIIIIMI 1 1 1 1 1 IIIMMI! II MIIIIIIIM I II I lllill 

3551 TCTTGAGTCCAACCCGGTAAGACACGACTTATCGCCACTGGCAGCAGCCA 3600
3513 CTGGTAACAGGATTAGCAGAGCGAGGTATGTAGGCGGTGCTACAGAGTTC 3562

III I IIIMMI! 1 1 1 1 1 II IMIII I IIIMMII MM I M I Mill I 

3601 CTGGTAACAGGATTAGCAGAGCGAGGTATGTAGGCGGTGCTACAGAGTTC 3650
3563 TTGAAGTGGTGGCCTAACTACGGCTACACTAGAAGGACAGTATTTGGTAT 3612

IIMIIIIIIMI I IN IMIMMMMMM MM II 1 1 M I III Ml 

3651 TTGAAGTGGTGGCCTAACTACGGCTACACTAGAAGGACAGTATTTGGTAT 3700
3613 CTGCGCTCTGCTGAAGCCAGTTACCTTCGGAAAAAGAGTTGGTAGCTCTT 3662

MM MMIIIMI II I IIMMIMI IIIIIMM MM I II I MM II 

3701 CTGCGCTCTGCTGAAGCCAGTTACCTTCGGAAAAAGAGTTGGTAGCTCTT 3750 
3663 GATCCGGCAAACAAACCACCGCTGGTAGCGGTGGTTTTTTTGTTTGCAAG 3712

) 1 1 1 1 1 1 1 1 1 M 1 1 II f 1 1 1 1 i t i I ! 1 1 1 1 1 1 1 ! II 1 1 1 II M 1 1 1 1 1 1 ! 

3751 GATCCGGCAAACAAACCACCGCTGGTAGCGGTGGTTTTTTTGTTTGCAAG 3800
3713 CAGCAGATTACGCGCAGAAAAAAAGGATCTCAAGAAGATCCTTTGATCTT 3762

MM MIIIIMM II I III I M MM IIIIIMIIIMM 1 1 1 1 II IM 

3801 CAGCAGATTACGCGCAGAAAAAAAGGATCTCAAGAAGATCCTTTGATCTT 3850
3763 TTCTACGGGGTCTGACGCTCAGTGGAACGAAAACTCACGTTAAGGGATTT 3812

IMIMMIMIM II 1 1 M MMMIMMMMMI II I 1 1 M II I M 

3851 TTCTACGGGGTCTGACGCTCAGTGGAACGAAAACTCACGTTAAGGGATTT 3900
3813 TGGTCATGAGATTATCAAAAAGGATCTTCACCTAGATCCTTTTAAATTAA 3862

II II M 1 1 1 II I II II 1 1 1 1 II I M II II 1 1 II II 1 1 1 1 II 1 1 II II I M 

3901 TGGTCATGAGATTATCAAAAAGGATCTTCACCTAGATCCTTTTAAATTAA 3950
3863 AAATGAAGTTTTAAATCAATCTAAAGTATATATGAGTAAACTTGGTCTGA 3912

IMIMIIIIIIII 1 1 1 MMMIMMMMMI III III II II M III 

3951 AAATGAAGTTTTAAATCAATCTAAAGTATATATGAGTAAACTTGGTCTGA 4000
3913 CAGTTACCAATGCTTAATCAGTGAGGCACCTATCTCAGCGATCTGTCTAT 3962

MMMMIMI II 1 1 1 1 IMIIMIMMMIIMM III 1 1 1 1 II III 

4001 CAGTTACCAATGCTTAATCAGTGAGGCACCTATCTCAGCGATCTGTCTAT 4050
3963 TTCGTTCATCCATAGTTGCCTGACTCCCCGTCGTGTAGATAACTACGATA 4012

MM MIMMIM II 1 1 IIMIIMM IIIIIMIMI II I II I II I M 

4051 TTCGTTCATCCATAGTTGCCTGACTCCCCGTCGTGTAGATAACTACGATA 4100 
4013 CGGGAGGGCTTACCATCTGGCCCCAGTGCTGCAATGATACCGCGAGACCC 4062 

MM MIIIIIIIM 1 1 MIMIMIM IIIIIIIMM II 1 1 1 MM M 

4101 CGGGAGGGCTTACCATCTGGCCCCAGTGCTGCAATGATACCGCGAGACCC 4150 
4063 ACGCTCACCGGCTCCAGATTTATCAGCAATAAACCAGCCAGCCGGAAGGG 4112

II II II 1 1 1 1 1 1 II I MMM IIMilMlllliliilllllli i M i I! 

4151 ACGCTCACCGGCTCCAGATTTATCAGCAATAAACCAGCCAGCCGGAAGGG 4200 
4113 CCGAGCGCAGAAGTGGTCCTGCAACTTTATCCGCCTCCATCCAGTCTATT 4162 

MM III I II I MMMMIMM IMMMMIMMMIMM III II 

4201 CCGAGCGCAGAAGTGGTCCTGCAACTTTATCCGCCTCCATCCAGTCTATT 4250 
4163 AATTGTTGCCGGGAAGCTAGAGTAAGTAGTTCGCCAGTTAATAGTTTGCG 4212 

Ml I ill ! 1 1 1 1 IMMMIM MIMMMMI! !IMI! illl I II 1 1 

4251 AATTGTTGCCGGGAAGCTAGAGTAAGTAGTTCGCCAGTTAATAGTTTGCG 4300
4213 CAACGTTGTTGCCATTGCTACAGGCATCGTGGTGTCACGCTCGTCGTTTG 4262 

III 1 1 1 1 1 1 1 1 1 1 1 1 II III 1 1 II MIMM III I MIMI II II I II 1 1 

4301 CAACGTTGTTGCCATTGCTACAGGCATCGTGGTGTCACGCTCGTCGTTTG 4350
4263 GTATGGCTTCATTCAGCTCCGGTTCCCAACGATCAAGGCGAGTTACATGA 4312 

III 1 1 1 1 1 1 1 1 1 1 1 1 Mill) I Kill MMI II II II II III 1 1 1 II 1 1 

4351 GTATGGCTTCATTCAGCTCCGGTTCCCAACGATCAAGGCGAGTTACATGA 4400
4313 TCCCCCATGTTGTGCAAAAAAGCGGTTAGCTCCTTCGGTCCTCCGATCGT 4362

M M II M 1 1 1 1! I ! M MM MMIIMMMII II! ! MM I M II M 

4401 TCCCCCATGTTGTGCAAAAAAGCGGTTAGCTCCTTCGGTCCTCCGATCGT 4450 
4363 TGTCAGAAGTAAGTTGGCCGCAGTGTTATCACTCATGGTTATGGCAGCAC 4412

Illl Illl MM I IIIMMIIIMIIIMM MIMM Mill I II 1 1 1 

4451 TGTCAGAAGTAAGTTGGCCGCAGTGTTATCACTCATGGTTATGGCAGCAC 4500
4413 TGCATAATTCTCTTACTGTCATGCCATCCGTAAGATGCTTTTCTGTGACT 4462 

II M I II M I II M I MM M I Ml M Mill II II Ml M M 1 1 1 II 1 1 

4501 TGCATAATTCTCTTACTGTCATGCCATCCGTAAGATGCTTTTCTGTGACT 4550 
4463 GGTGAGTACTCAACCAAGTCATTCTGAGAATAGTGTATGCGGCGACCGAG 4512 

MM i Ml I Illl llllll IN Ml I MM MMM MMM MM Ml I 

4551 GGTGAGTACTCAACCAAGTCATTCTGAGAATAGTGTATGCGGCGACCGAG 4600 
4513 TTGCTCTTGCCCGGCGTCAATACGGGATAATACCGCGCCACATAGCAGAA 4562 

II III I II M 1 1 1 1 i MIMM I M IMIMM II Mill I M ! M I M i 

4601 TTGCTCTTGCCCGGCGTCAATACGGGATAATACCGCGCCACATAGCAGAA 4650 
4563 CTTTAAAAGTGCTCATCATTGGAAAACGTTCTTCGGGGCGAAAACTCTCA 4612 

M M M MMM 1 1 M M M MM IIMMlMMM | MM M 1 1 M I ! M 

4651 CTTTAAAAGTGCTCATCATTGGAAAACGTTCTTCGGGGCGAAAACTCTCA 4700
4613 AGGATCTTACCGCTGTTGAGATCCAGTTCGATGTAACCCACTCGTGCACC 4662 

Mill M 1 1 1 II II MMIMMMMMIIIMIMMIMM I M I M I 

4701 AGGATCTTACCGCTGTTGAGATCCAGTTCGATGTAACCCACTCGTGCACC 4750 
4663 CAACTGATCTTCAGCATCTTTTACTTTCACCAGCGTTTCTGGGTGAGCAA 4712 

.., B , IIIIIIIIIIIIINIIIIIIIIIIIIIMIIIIIIIIillllillllli 

4751 CAACTGATCTTCAGCATCTTTTACTTTCACCAGCGTTTCTGGGTGAGCAA 4800
4713 AAACAGGAAGGCAAAATGCCGCAAAAAAGGGAATAAGGGCGACACGGAAA 4762

jani IIIMIIMIIIIIIIIMIMIIIMIIIIIMMilliiiiiiiiiii 

4801 AAACAGGAAGGCAAAATGCCGCAAAAAAGGGAATAAGGGCGACACGGAAA 4850 
4763 TGTTGAATACTCATACTCTTCCTTTTTCAATATTATTGAAGCATTTATCA 4812 

ja51 INIillMIIIIIIIIIMIIIIIIIIIMIIIIIIIIIIMIUIMI 

4851 TGTTGAATACTCATACTCTTCCTTTTTCAATATTATTGAAGCATTTATCA 4900 
4813 GGGTTATTGTCTCATGAGCGGATACATATTTGAATGTATTTAGAAAAATA 4862

Joni IIIIIIIIIIIIIIIIIIIIIIMIIIIIIIIIIIIllMiMiinTii 

4901 GGGTTATTGTCTCATGAGCGGATACATATTTGAATGTATTTAGAAAAATA 4950 
4863 AACAAATAGGGGTTCCGCGCACATTTCCCCGAAAAGTGCCACCTGACGTC 4912

lllllllllllllllllllllllllllllllllllllllllillllllll 

4951 AACAAATAGGGGTTCCGCGCACATTTCCCCGAAAAGTGCCACCTGACGTC 5000 

4913 TAAGAAACCATTATTATCATGACATTAACCTATAAAAATAGGCGTATCAC 4962

500! 'iiliiil'IIIIIIIIIIMIIIIIIIIIIIIIIIIIIIIIIIIliril 

5001 TAAGAAACCATTATTATCATGACATTAACCTATAAAAATAGGCGTATCAC 5050

4963 GAGGCCCTTTCGTCTCGCGCGTTTCGGTGATGACGGTGAAAACCTCTGAC 5012

^ JLI » JLJL M M 1 1 1 1 1 M 1 1 1 1 1 1 1 f 1 1 1 f 1 1 M I M M 1 1 ! 1 1 1 1 M ( 1 1 1 

5051 GAGGCCCTTTCGTCTCGCGCGTTTCGGTGATGACGGTGAAAACCTCTGAC 5100 
5013 ACATGCAGCTCCCGGAGACGGTCACAGCTTGTCTGTAAGCGGATGCCGGG 5062

,mi Uii'iiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiii 

5101 ACATGCAGCTCCCGGAGACGGTCACAGCTTGTCTGTAAGCGGATGCCGGG 5150
5063 AGCAGACAAGCCCGTCAGGGCGCGTCAGCGGGTGTTGGCGGGTGTCGGGG 5112 

'J. 'I'" ii i ii i iii ii ill ii mi iii mi ii in i ii 1 1 1! i in 

5151 AGCAGACAAGCCCGTCAGGGCGCGTCAGCGGGTGTTGGCGGGTGTCGGGG 5200
5113 CTGGCTTAACTATGCGGCATCAGAGCAGATTGTACTGAGAGTGCACCATA 5162

„„, IMIMIIIIIIIIIIIIIIIIMIIIIIIIIIIIIIIIIIIIIIUIII 

5201 CTGGCTTAACTATGCGGCATCAGAGCAGATTGTACTGAGAGTGCACCATA 5250
5163 TGCGGTGTGAAATACCGCACAGATGCGTAAGGAGAAAATACCGCATCAGG 5212

52qi IIIININMIIIIIIMIIIIIIIIIIIIIIIIIIIIIIIMIIIIM 

5251 TGCGGTGTGAAATACCGCACAGATGCGTAAGGAGAAAATACCGCATCAGG 5300 
5213 CGCCATTCGCCATTCAGGCTGCGCAACTGTTGGGAAGGGCGATCGGTGCG 5262 

„ ni IMIIMIIIIIIMIIIIIIIIIIIIIIIIIIIIlliiiiiMIIMM 

5301 CGCCATTCGCCATTCAGGCTGCGCAACTGTTGGGAAGGGCGATCGGTGCG 5350 
5263 GGCCTCTTCGCTATTACGCCAGCTGGCGAAAGGGGGATGTGCTGCAAGGC 5312

I Ml II 1 1 1 1 1 1 1 II 1 1 M f 1 1 i M MM IMIM 1 1 i I ! 1 1 1 1 1 1 1 1 i I 

5351 GGCCTCTTCGCTATTACGCCAGCTGGCGAAAGGGGGATGTGCTGCAAGGC 5400
5313 GATTAAGTTGGGTAACGCCAGGGTTTTCCCAGTCACGACGTTGTAAAACG 5362

MM II 1 1 M 1 1 II I M I M M 1 1 M M M M II 1 1 M M M M I M I M 

5401 GATTAAGTTGGGTAACGCCAGGGTTTTCCCAGTCACGACGTTGTAAAACG 5450 
5363 ACGGCCAGTGCC 5374

||||||||||||

5451 ACGGCCAGTGCC 5462
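
The match lines in the alignment above mark the positions at which the two aligned sequences carry the same base. The following is a minimal Python sketch, not part of the original disclosure, of how such a line and a percent identity could be recomputed from two aligned rows. The function names are illustrative, comparison is assumed to be case-insensitive, and '-', '.' and a space are assumed to denote gaps.

```python
GAP_CHARS = "-. "  # assumed gap symbols; the source does not define them


def match_line(top, bottom):
    """Return '|' where two aligned rows agree (ignoring case), else a space."""
    if len(top) != len(bottom):
        raise ValueError("aligned rows must have equal length")
    marks = []
    for a, b in zip(top, bottom):
        same = a.upper() == b.upper() and a not in GAP_CHARS
        marks.append("|" if same else " ")
    return "".join(marks)


def percent_identity(top, bottom):
    """Identical positions as a percentage of all aligned columns."""
    line = match_line(top, bottom)
    return 100.0 * line.count("|") / len(line) if line else 0.0


# Final block of the alignment (rows 5363-5374 and 5451-5462).
top = "ACGGCCAGTGCC"
bottom = "ACGGCCAGTGCC"
print(match_line(top, bottom))
print(percent_identity(top, bottom))
```

Applied block by block, the same two functions would give a per-block identity. An identity for the whole alignment would need the rows concatenated first.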
[Figure: preparation of the vector-primer. Plasmid map labels: SV40 Ori, Ad2 MLP, DHFR, SV40 pA, Amp r, Col E1 Ori, Eco RI and Sal I sites. Steps shown: Sal I/Eco RI digestion, giving a Sal I end and an Eco RI end; ligation of the 3' linker and the 5' linker (RNA Tag Seq.) with T4 DNA ligase. Product: Vector-Primer with a T30VN tail and Apa I and Bgl II sites (quality assurance with Apa I and Bgl II digestion).]

3' linker

5'-CTAATCTGATCCGCTAGTGGTAC-3'
3'-NV(T)30GATTAGACTAGGCGATCACCATGAGCT-5'

5' linker

RNA Tag Sequence
5'-AATTCGAGTGAACACTCGAGCTCACTAGTGACCAGCTGATACGCCTCAAA-3'
3'-GCTCACTTGTGAGCTCGAG-5'
Preparation of Primers-Attached-Vector

[Figure: plasmid map labels: SV40 Ori, Ad2 MLP, f1 Ori, Amp r, Col E1 Ori, Eco RI and Kpn I sites. Steps shown: Kpn I/Eco RI digestion, giving a Kpn I end and an Eco RI end; ligation of the 3' linker and the 5' linker (RNA Tag Seq.) with T4 DNA ligase. Product: Vector-Primer with a T48VN tail, a Bst XI site and two Hind III sites (quality assurance with Hind III and Bst XI digestion).]

3' linker

5'-CTAATCTGATCCGCTAGTGGTAC-3'
3'-NV(T)48GATTAGACTAGGCGATCAC-5'    V=A,C,G  N=A,C,G,T

5' linker

RNA Tag Sequence
5'-AATTCGAGTGAACACTCGAGCTCACTAGTGACCAGCTGATACGCCTCAAA-3'
3'-GCTCACTTGTGAGCTCGAG-5'
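
The degenerate tail of the 3' linker (V = A, C, G; N = A, C, G, T) stands for a mixture of oligonucleotides. The following Python sketch, added for illustration only, enumerates that mixture and checks that the two 3' linker strands shown above are complementary. The helper names are hypothetical, and the tail length of 48 is read from the "T48VN" label in the figure.

```python
from itertools import product

# IUPAC codes given in the figure legend: V = A, C or G; N = A, C, G or T.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "V": "ACG", "N": "ACGT"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def expand(oligo):
    """List every concrete sequence encoded by a degenerate oligo."""
    return ["".join(p) for p in product(*(IUPAC[b] for b in oligo))]


def complement(seq):
    """Base-by-base complement, without reversing the strand."""
    return seq.translate(COMPLEMENT)


# 3' linker strands as drawn: top 5'->3', bottom 3'->5' (duplex part only).
top = "CTAATCTGATCCGCTAGTGGTAC"
bottom_duplex = "GATTAGACTAGGCGATCAC"

paired = top[:len(bottom_duplex)]     # part of the top strand that is base-paired
overhang = top[len(bottom_duplex):]   # unpaired 3' end of the top strand
anchors = expand("VN")                # the two anchor bases, written 5'->3'
tails = expand("T" * 48 + "VN")       # full T48VN tails (length assumed from the label)

print(complement(paired) == bottom_duplex)
print(overhang)
print(len(anchors), len(tails))
```

Under these assumptions the top strand pairs with the bottom strand over 19 bases and leaves a four-base GTAC 3' overhang, which is the overhang Kpn I produces. The tail is a mixture of 12 sequences (3 choices for V times 4 for N).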
[Figure: restriction map of pNOTs. Sites shown: HindIII 1, BamHI 345, BamHI 1243, ClaI 1358, NaeI 1488, BamHI 1927, SapI 2135, NheI 2891.]

Plasmid name: pNOTs
Plasmid size: 4529 bp

Comments/References: pNOTs is a derivative of pMT2 (Kaufman et al., 1989, Mol. Cell. Biol. 9). DHFR was deleted and a new polylinker was inserted between EcoRI and HpaI. The M13 origin of replication was inserted in the ClaI site. SST cDNAs are cloned between EcoRI and NotI.
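
A restriction map such as the one above can be checked by predicting the fragment sizes of a digest. The Python sketch below is an illustration that is not taken from the original. It assumes that the numbers on the map are cut coordinates on the circular 4529 bp plasmid and that the label at position 345 is a BamHI site. The function name is hypothetical.

```python
def circular_fragments(cut_positions, plasmid_size):
    """Fragment lengths obtained by cutting a circular plasmid at the given positions."""
    cuts = sorted(set(cut_positions))
    if not cuts:
        return [plasmid_size]  # an uncut circle stays one molecule
    # Fragments between consecutive cuts.
    sizes = [b - a for a, b in zip(cuts, cuts[1:])]
    # Fragment that runs from the last cut, through the origin, to the first cut.
    sizes.append(plasmid_size - cuts[-1] + cuts[0])
    return sizes


PNOTS_SIZE = 4529                # bp, from the map
BAMHI_SITES = [345, 1243, 1927]  # map positions, assumed to be cut coordinates

fragments = circular_fragments(BAMHI_SITES, PNOTS_SIZE)
print(sorted(fragments, reverse=True))
```

With these inputs a complete BamHI digest would give fragments of 2947, 898 and 684 bp, which sum to the plasmid size.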
[Figure: PAVE scheme. Labels: mRNA ending in (A)n-OH 3', RNA Tag Seq., Vector-Primer with NV(T)30 tail, ss cDNA, ds Vector DNA. Steps in the order shown: Annealing; RNase I Digestion; Streptavidin Capture; RNase H Digestion; Dilution/Denature/Renature.]